Catalogues, Web Stores and Databases on electronic media

I.

Summary

There are many reasons why you may want to put information on CD: to reduce environmental damage, to save costs, to offer large volumes of data or to use the speed and power of electronic searches.

These same benefits also apply to putting your catalogue on the internet, and many of the points made in this paper apply equally well to that alternative. However, there are many instances where the large data capacity and portability of a CD or DVD make it the logical choice.

In practice, the best solution will almost certainly be a consolidation of data rather than the introduction of yet another medium. In this model, the database containing the source information for the catalogue can be accessed over the internet, and the publication process is simply a matter of copying the website and database to CD.

II.

Introduction

In this paper I briefly cover some of the issues involved in producing a CD or DVD containing a product catalogue. Many of these issues apply to other types of data CD and, indeed, other forms of electronic storage such as the internet.

Although I cannot attempt to offer a comprehensive, detailed analysis, I do work systematically through some of the points that I have personally found to be important.

I have tried to be fair and balanced with my comments but my objectivity is constrained by the fact that I am involved professionally with PHDCC. I am therefore more familiar with their software solutions than some of the alternatives that I mention.

III.

Appropriate Stepwise Action

It is very important to review at the outset where you are now, where you would like to get to in the end and what the reasonable intermediate steps are. The timescale should take into account the difficulties likely to be encountered and allow time to learn from problems with the previous step.

The starting point will include such issues as:

Is the current publishing process entirely sub-contracted?
How much of the formatting of data is done in house?
What conversion is required (database, spreadsheet, word processor to desktop publisher, Windows to Mac)?
Do you already have a web shop?
Do you already have an internet or intranet database system that provides the information for your catalogue?

These questions have a range of complicated answers, and the possible steps that arise will have to be thought through carefully, taking into account the other factors mentioned in this paper. However, the spectrum of possibilities includes:

Take the existing information (as sent to the publisher) with the minimum of file conversion, add a start page and a search engine. If you do this you should not assume that the user of your CD will have the applications required to view the documents (such as MS Office), so provide viewers on the CD. (There is useful information on this topic on www.shellrun.com.)
Convert the existing information to one or more documents (pdf or html, say) which are searchable and can be used on a wide variety of operating systems. You might need to write or acquire some special software to convert the file format to suit your needs (for example, to break a large document into smaller units or vice versa).
Export the information in the form of a database that can be run from a CD.
Export the information to a web style database which could be run from the CD using a server capable of running from CD. At this point it would make sense to simultaneously publish the package onto the internet.
Maintain an on-line shop with stock control and upload the information to CD from time to time, again using a CD web page server.

IV.

Basic Issues

(a) Searchability

The most important aspect of any CD you produce (at least from the perspective of the user) is the facility to find things quickly and easily. Although in some circumstances a structured layout would be sufficient, it is always good practice to have a search engine and, often, this can be made the primary access point to the CD.

The search engine you use will depend on the type of CD you opt for:

Raw files as sent to the publisher will need a search engine that can cope with different file types. Many desktop publishing formats will not be supported, but Spy-CD (with which I am most familiar) can index web pages, pdf, Word, Excel, PowerPoint, image files and plain text.

Converting the documents to a more portable form will increase the number of search engine options available.
By putting your information in a database (or including the existing database on the CD) you will be able to use the built-in database search facilities. These will not search all the documents on the CD (which could well be a benefit as well as a disadvantage) and probably will not have the sophistication of searches built into search engines.

If you opt for putting a web site and web server on the CD you can use both the database and the search engine for searching.
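
To make the idea of a search engine on the CD a little more concrete, here is a minimal sketch of the word index that such a tool builds behind the scenes. It is an illustration only and not how Spy-CD or any other named product works; the page names and text are hypothetical, and the documents are assumed to have been converted to plain text already.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical catalogue pages, already converted to plain text.
pages = {
    "pumps.html": "Stainless steel pump, 12 volt",
    "valves.html": "Brass valve with stainless steel handle",
    "hoses.html": "Rubber hose, 5 metre",
}
index = build_index(pages)
print(sorted(search(index, "stainless steel")))  # ['pumps.html', 'valves.html']
```

The index is built once, when the CD is prepared, so that a search on the user's machine only has to look words up rather than read every file.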

(b) Cost

With any project of this type the largest cost will be your time and you should use realistic estimates for your hourly rate including relevant overhead allocations. The total is typically more than double your wage! Use this figure when you make the original cost benefit justification as well as for decisions over which software to use or whether to outsource work.

The amount of work involved in setting up the new system will depend on your starting point and which solution you are aiming at. Generally it is a good idea not to be too ambitious with the first step, but a couple of thousand USD would not be unreasonable to account for the in-house work involved in setting up a system to publish existing material. Setting up a web shop or internet database system can be a steep (but rewarding) learning curve and you should think seriously about getting fixed price quotations from external contractors.

The right software tools can reduce the work content of the project significantly. Many are available free or open source, such as the MySQL database and PHP script interpreter, but these sometimes require more expertise to get working. You should spend some time searching the internet for alternatives and look at forums where people have commented or provided help. A one-off cost of between zero and fifteen hundred USD will cover the majority of installations, although some software operates with an annual or per CD licence fee.

CD publishing is one of the lesser costs, which is why you are looking at distributing information on CD! The unit price falls with quantity, but a production run of 500 CDs might cost 3.00 USD per CD for duplication, on-body 4 colour screen print, standard jewel case, and litho printed booklet and inlay. Again, you should shop around for an appropriate deal.

Don't forget postage and packing.
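
The cost elements above can be added up in a few lines. The sketch below uses the figures quoted in this section where they exist (a factor of two on wages, up to 1500 USD for software, 500 CDs at 3.00 USD each). The hours, hourly wage and postage per CD are my own illustrative assumptions and should be replaced with your own numbers.

```python
def project_cost(in_house_hours, hourly_wage, overhead_factor,
                 software_cost, cd_count, unit_cost, postage_per_cd):
    """Return the cost components and their total, all in USD."""
    labour = in_house_hours * hourly_wage * overhead_factor
    production = cd_count * unit_cost
    postage = cd_count * postage_per_cd
    return {
        "labour": labour,
        "software": software_cost,
        "production": production,
        "postage": postage,
        "total": labour + software_cost + production + postage,
    }


costs = project_cost(
    in_house_hours=40,     # assumption: one working week of in-house effort
    hourly_wage=25,        # assumption: wage before overheads
    overhead_factor=2.0,   # "more than double your wage", taken as exactly double
    software_cost=1500,    # upper end of the one-off range quoted above
    cd_count=500,
    unit_cost=3.00,        # per CD, as quoted above
    postage_per_cd=1.00,   # assumption: postage and packing per CD
)
for name, value in costs.items():
    print(f"{name:>10}: {value:8.2f} USD")
```

With these inputs the labour comes to 2000 USD and the total to 5500 USD, so the in-house time is the largest single item even for a modest project.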

(c) Time to produce
(i) Initial one-off

The time to set up your system in the first place will depend on the factors discussed above and how much time you can allocate to this project. If you use some software with which you are unfamiliar you need to build in some learning time. It's a good idea to look at the software supplier's web site for clues about their technical query response time (such as customer comments). In some instances an email dialogue can take weeks to solve a straightforward problem!

Jobs that can often absorb large amounts of time include things like setting up data conversion processes (especially if complex formatting is required for the output at the same time). Obviously, if you decide you need to review the architecture of the business management software there is, potentially, a great deal of work (see the section Going Back a Step below).

(ii) Routine per publication

The objective should be to reduce the work that has to be done for the production of each catalogue. Ideally it should entail simply the press of one button. However, it is probably better to err on the side of simplicity at the beginning and add automation later. The critical thing is to get the data structures and processes right, run them through manually to start with, sort out any bugs, and only then convert them to programs.

(d) User friendliness
(i) Automatic starting
You must assume the absolute minimum about the eventual user of your CD. They probably will not be highly computer literate, they might not have a very modern computer, they might not be using the latest version of Windows, they might not be using Windows at all, and they might even have a poor understanding of English! At some point you will have to draw the line but you should have gone through the thought process and be clear about the justification for your decisions. You should prepare yourself for the consequences.

Ideally the CD will auto-run when it is put into the PC. If some of your users run Mac or Unix type systems then you will need to make special provisions to get them to auto-run. Many computers will have auto-run switched off anyway so you need to give clear written instruction for the user in this eventuality.

One of the most versatile file formats to use on your CD is HTML web pages as these will run on all machines with a fairly standard appearance. However you will need a special program such as ShellRun to start the user's default browser.
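
On Windows, auto-run is controlled by a small text file called autorun.inf in the root folder of the CD. The sketch below generates one. The [autorun] section and the open and label keys are standard, but the launcher file name and the way the start page is passed to it are assumptions on my part, so check the documentation of ShellRun (or whichever launcher you use) for the exact command line.

```python
def make_autorun_inf(launcher, start_page, label):
    """Build the text of an autorun.inf file that opens the start page."""
    lines = [
        "[autorun]",
        f"open={launcher} {start_page}",  # command run when the CD is inserted
        f"label={label}",                 # name shown for the CD drive
    ]
    # Windows text files conventionally use CR LF line endings.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical file names; both files would sit in the root of the CD.
autorun_text = make_autorun_inf("shellrun.exe", "index.html", "Product Catalogue")
print(autorun_text)

# To write the file into the folder you are about to burn (path is an example):
# with open("cd_image/autorun.inf", "w", newline="") as f:
#     f.write(autorun_text)
```

Remember that this only helps Windows users who have auto-run enabled; everyone else still needs the written instructions.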

If you need to run a server from the CD, such as Dynamic-CD, you will find that you are limited to Windows systems (W95 will usually need some upgrades to work). If a significant proportion of your target audience uses non-Windows machines then you will have to put different versions of a server such as Apache on the CD. You will almost certainly need to use manual intervention to some extent and install some files on the user's hard disk. (Assuming the machine has one!)

(ii) Non Installation of software
If at all possible, don't require your users to install the information onto their hard disk. Many people will be worried about filling up the space or causing conflicts that could crash their computer, will have a company policy against putting information on PCs (without authority from the IT department), or will simply not want to hang around waiting for the process to complete.

In some instances you will need to copy some files onto the user's computer. For instance, MySQL and Apache need to be able to write to certain log files, so these files obviously cannot be on a CD. They should go into the %TEMP% folder rather than a fixed folder such as C:/somethingorother, because the C: drive may not exist and because it needs to be clear that these files can be deleted at a later stage to release disk space. If at all possible, automate the checking and copying process using batch or script files, but make sure you provide information about what you are doing. You will find helpful ideas on this topic on the phdcc web site.
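
The checking and copying step might look something like the following sketch. It is written in Python purely for illustration (on a real CD you would more likely use a batch file or a small compiled program), and the folder and file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_writable_files(cd_root, relative_paths, subfolder="catalogue_cd"):
    """Copy files that must be writable from the CD into the temp folder.

    Returns the folder the files were copied to.
    """
    # gettempdir() honours the TEMP / TMP environment variables,
    # so no drive letter is hard coded.
    target = Path(tempfile.gettempdir()) / subfolder
    target.mkdir(parents=True, exist_ok=True)
    for rel in relative_paths:
        source = Path(cd_root) / rel
        destination = target / Path(rel).name
        if not destination.exists():
            # copyfile copies the contents only, so the read-only flag
            # that files on a CD carry is not copied with them.
            shutil.copyfile(source, destination)
            print(f"Copied {rel} to {destination} (safe to delete later)")
    return target


# Example call, assuming the CD is drive D: and holds an empty log file:
# copy_writable_files("D:/", ["server/logs/access.log"])
```

The message printed for each file is the important part: it tells the user what has been put on their disk and that it can be removed.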

(iii) Virus Warnings
If you find that you do need to run scripts that write to the hard disk, any correctly set up computer should give a Potential Virus warning. You need to warn your users that this is going to happen and tell them what to do (presumably, to allow the script to run).

(e) Packaging
Make your CD something to be cherished! With junk CDs being distributed all over the place you need to differentiate yourself. Make the printing attractive as well as informative; send the CD in an A4 ring binder sleeve so it can be filed somewhere other than the bin!

V.

More Comprehensive Solution

(a) Converting to database
As well as the issues mentioned above, there are a number of points that have to be considered if you decide to use a database on your CD. Most information needed for a catalogue can be presented as a single table and this makes some simplification options a possibility:

If the data used for your catalogue is not in the form of a database then you will have to transfer it. This process is probably best done by exporting the information to a text file. If you use commas to separate fields and put single quotes around text fields you will be able to import the data into most database systems. However you will probably get errors if there are apostrophes within the text so you will have to look at the procedures available for exporting and importing for each application.
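
The apostrophe problem is easy to demonstrate. The sketch below uses Python's standard csv module to write a comma separated file with single quotes around the text fields, doubling any apostrophe inside the text, and then reads it back. The product data is invented for the example, and doubling the quote character is only one convention: check what your own database's import routine expects.

```python
import csv
import io

# Hypothetical catalogue rows: code, description, price.
rows = [
    ("P-100", "Gardener's trowel", 4.50),
    ("P-101", "Hose, 5 metre", 12.00),
]

# Export: quote every text field with single quotes; an apostrophe
# inside the text is written twice so it cannot end the field early.
buffer = io.StringIO()
writer = csv.writer(buffer, quotechar="'", quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(("code", "description", "price"))
writer.writerows(rows)
exported = buffer.getvalue()
print(exported)

# Import: the same settings turn the text back into the original values.
reader = csv.reader(io.StringIO(exported), quotechar="'",
                    quoting=csv.QUOTE_NONNUMERIC)
header = next(reader)
imported = [tuple(row) for row in reader]
print(imported == rows)  # True
```

Note that the comma inside "Hose, 5 metre" causes no trouble either, because it sits inside the quotes.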

Microsoft Access is a very common application for use on PCs and it provides a reasonably easy to use graphical user interface. All Windows machines are shipped with drivers installed to allow other programs to query and update Access files on the computer. One advantage of using Access as the format for your database is that it can then be read by an application running on your CD without the need to install anything on the hard disk.

Alternatives to Access that are also supported by having drivers included on Windows systems are Excel spreadsheet and text file database formats. If your information is already in a spreadsheet or you have enough control over the export process to write your information to a suitable text file then you can access the information directly using SQL queries from web page scripts as described below. These alternatives may not be adequate for more complicated relational database structures but, as mentioned above, catalogues are generally in the form of a single table so they are worth looking at.
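
As an illustration of how little SQL a single table catalogue needs, here is a sketch of a keyword search. It uses the sqlite3 module built into Python as a stand-in for the Access, Excel or text file drivers discussed above; the table name, column names and data are all hypothetical.

```python
import sqlite3

# An in-memory database standing in for the catalogue file on the CD.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalogue (code TEXT PRIMARY KEY, description TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO catalogue VALUES (?, ?, ?)",
    [
        ("P-100", "Gardener's trowel", 4.50),
        ("P-101", "Hose, 5 metre", 12.00),
        ("P-102", "Hose connector", 2.25),
    ],
)


def find_products(connection, keyword):
    """Return (code, description, price) rows whose description contains keyword."""
    cursor = connection.execute(
        "SELECT code, description, price FROM catalogue "
        "WHERE description LIKE ? ORDER BY code",
        (f"%{keyword}%",),  # passed as a parameter, never pasted into the SQL
    )
    return cursor.fetchall()


for row in find_products(conn, "hose"):
    print(row)
```

Passing the search text as a parameter rather than building the SQL string by hand also avoids the apostrophe problem when a user searches for something like "gardener's".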

There are many other database systems available but most differentiate themselves by offering good performance and reliability with very large systems. This is obviously not relevant to a single user looking at information on a CD. What is more important, though, is reducing the need to convert data from one database system to another. For this reason you should probably opt to use the same database application on your CD that you use, or intend to use, for your on-line database. In many instances this will also constrain the script language available: the most common combination is MySQL with PHP scripting (both are also available free for you to use on your CD); another is Access with ASP scripts.

If your information is in an Access database you could simply include this on the CD. The problem with this option is that not everyone has Access installed on their computer. You could provide a viewer application, but most of these need to be installed on the user's computer and you would be better off using one of the server options below with a script to read from the database.

Some database languages provide a way for developers to embed the database into an application. That is, you have a program on your CD that starts automatically and provides screens for entering search criteria and presenting the results. This alternative can give you a lot of control over the way your CD behaves but it is normally very intensive in terms of the level of programming required.

(b) Servers on CD
There are a few applications that you can include on your CD that will allow it to operate as an HTML server and run active web pages with scripts. The reason this is an attractive alternative is that it allows you to transfer your existing information in the form of web pages but with database functionality. Different software approaches the problem in various ways:

Some systems create an executable file that you can put on the CD. The file is a compiled form of all the web pages and scripts in the site. Normally any databases are not compiled but kept as separate files. The alternative approach has the server running on its own interpreting the various files in the way a normal web server would.
Some servers provide basic functionality and add .ASP or .PHP interpretation as filters. Others have built in script interpretation.
Some systems can work with encrypted files or folders, others can't.
Some are aimed at reasonably low level programmers and provide powerful solutions at the cost of a steep learning curve, others are simple to use with straightforward graphical interfaces.
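
To show what "a server running on its own" amounts to, here is a minimal sketch using the web server in Python's standard library. It only serves static files from a folder and does not interpret ASP or PHP scripts, so it is a stand-in for the idea and not a substitute for products such as Dynamic-CD or Apache. The folder name in the usage comment is hypothetical.

```python
import functools
import http.server
import threading


def start_catalogue_server(root_folder, port=0):
    """Serve the files under root_folder on this machine only.

    port=0 asks the operating system for any free port; the port
    actually chosen is available as server.server_address[1].
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root_folder)
    )
    # Binding to 127.0.0.1 means other computers cannot connect.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


# Example use, assuming the web site is in the folder D:/site on the CD:
# server = start_catalogue_server("D:/site")
# print(f"Open http://127.0.0.1:{server.server_address[1]}/index.html")
# ... later: server.shutdown()
```

The user's browser then talks to this local server exactly as it would to a web site, which is why the same pages can be published on the internet unchanged.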

All these servers should be able to run from the CD without installing more than a few temporary files on the hard disk. However, there are two other possibilities that are worth considering: Java and .NET. Both of these require an application (the Java virtual machine or the .NET framework) to be installed on the user's computer. The difference is that this application is of general use: it is a bit like an extension of the operating system and can, in theory, be used by lots of other applications in the future (much like having a Flash player installed). Both Java and .NET have ways of serving web pages and reading from the databases mentioned above. .NET is Microsoft's baby and is being heavily pushed, but Java will almost certainly run on more non-Windows systems.

At the end of the day you should be guided by the systems already in use within your organization or supported by your Internet Service Provider. There's no point writing a Java applet to access your database on CD if you will have to convert it all to MySQL and php to publish it on the web at some later date.

VI.

Going Back a Step

(a) A web accessed database

While you are reviewing the various possibilities for putting information onto a CD or DVD it is a good opportunity to look over your general business management software. In many instances this system will represent a large investment of time and money and a decision to switch to a new system will have profound consequences for your business if it doesn't work out!

However, the world of computer systems is constantly changing and what was futuristic five years ago is now well established, safe technology. As well as this, the business environment is evolving, with peripheral systems for quality, health and safety and environmental control being integrated into the core database model. And finally, sales over the internet are growing in quantity and acceptability, with most ISPs offering free on-line shopping cart facilities.

The advantages of putting your main business software on the internet are:

Cost. All the server work, such as initial purchase, upgrades and maintenance, is handled by someone else for a few USD per month. Plus you only need to buy basic PCs for your office that can run a browser; you won't need to keep throwing out perfectly good machines just because they can't run the latest version of Windows.
Flexibility. If you make the wrong decision with your ISP you can always upgrade or switch to another one.
Accessibility. You will be able to send and receive information anywhere in the world, even on your mobile phone.
Safety. A professional service provider ought to be able to give better server up-time with a lower risk of catastrophic data loss than you can.
Sales. Your customers can now place their own orders on your system, check up on work in progress and stock availability, and settle invoices.

The disadvantages are:

Your customers will be able to check up on your delivery promises!
Having to re-write software.
Transferring data.
Because the system is on-line you will have to be much tighter with userid/password access than most small businesses bother with!
You will be stuck if the telephone system crashes.

Although there will be a sizeable amount of work involved in a project of this type there are several sophisticated tools available for converting existing database systems or generating web pages from an existing database structure. There are even complete open source systems that can be brought on-line quickly then modified to suit your requirements.

VII.

Conclusion

Putting a catalogue on CD or DVD can involve a lot of issues but, if it is done in a methodical way using the appropriate resources, it can be done quickly and without undue expense.

VIII.

Software used previously

This is not exhaustive; simply a list of things of which I have personal experience. You should search on the internet for comments and alternatives.

(a) Databases
Access, MySQL, SQLite

(b) Search Engines
FindinSite, search maker, swish-e, zoom

(c) Servers
Dynamic-CD, Abyss, Apache, Xitami

(d) Script languages
ASP, PHP, Perl, .NET, Java

(e) Useful tools
ShellRun (autorun application for Windows), phpMyEdit, ASP/PHP Web Application Builder, Code Charge Studio (generate screens from a database), Nola (accounts, stock and sales control using PHP and MySQL)