BotGadget: The Eloquence of the Barcode

Beep. You hear it every time you buy a product in a retail store. The checkout person slides your purchase over a scanner embedded in their checkout stand, or shoots it with a handheld scanner. The familiar series of bars and spaces on the label is digitized, decoded to digits, and then used as a query to a database of every product that particular store sells. It happens so often that we take it for granted. Modern barcodes have been around for 41 years now. The first product purchased with a barcode was a 10-pack of Juicy Fruit gum, scanned on June 26, 1974 at a Marsh supermarket in Troy, Ohio. The code scanned that day was UPC-A, the same barcode we still use today on just about every retail product you can buy.

The history of the barcode is not as cut and dried as one would think. More than one group has been credited with inventing the technology. How does one encode data on a machine, store it on a physical medium, then read it at some later date? Punch cards and paper tape have been doing that for centuries. The problem was storing that data without cutting holes in the carrier. The overall issue was common enough that efforts were launched in several different industries.

In the 1930s, John Kermode, Douglas Young, and Harry Sparkes created a four-bar barcode. They were Westinghouse engineers, and not surprisingly the application was to automate the payment processing of electric power bills. The patents, however, were generalized as “Card sorters”.

In 1948, Bernard Silver and Joseph Woodland began work on a system for reading linear and circular printed codes for supermarkets. They took their inspiration from optical audio tracks used in 16mm and 35mm film. In fact, their reader employed an RC935 photomultiplier tube normally used in movie projectors. Silver and Woodland are often credited as inventors of the barcode, but they reference the Westinghouse patent in their own work. Several companies including IBM took interest in the patent, but determined that key technologies still needed to be developed before it would be a practical system. Philco bought the patent, eventually selling it to RCA.

Perhaps the most infamous claim to the barcode throne came from Jerome H. Lemelson. Lemelson was granted over 600 patents in his lifetime, including some for machine vision. Many of these were considered submarine patents. He made most of his fortune by enforcing and licensing those patents to the tune of 1.3 billion dollars. This branded him as an early patent troll. Lemelson’s barcode patents were declared unenforceable in a landmark 2004 court case against Cognex Corporation and Symbol Technologies. This case is often referenced in patent troll litigation today.

What we know as the modern barcode got its start in the late 1960s. Local markets were evolving into supermarkets. Checkout systems with mechanical cash registers were the obvious bottleneck. But how to speed things up? Grocery trade associations created the Uniform Grocery Product Code Council (now GS1) to tackle the problem. The council solicited solutions and received proposals from RCA, IBM, Singer, Dymo, Litton, and Pitney Bowes, among others. RCA drew on the Silver and Woodland patent to create a bull's-eye code. IBM may not have had the patent, but they had something better. Joseph Woodland had been an IBM employee for several years at that point. He was recruited to a team which included George Laurer. Laurer is still active in the industry, maintaining a webpage with information about barcodes. The team worked hard to design a robust code. In the end it was the IBM code that became the Universal Product Code (UPC) we all have come to know.

The UPC symbology has remained relatively unchanged since 1974. There have been some extensions to encode extra data, but the core has endured as a long-lasting standard. Once the code was in use, a revision would require massive changes from the printing industry all the way through the point of sale industry.

Building a Barcode

UPC-A is a numeric-only symbology with a fixed width. Each UPC-A symbol encodes twelve digits; however, one digit is used as a check character, leaving only eleven usable digits. The framework of the code starts with a quiet zone, which is literally a blank area with the same color as the spaces. Just inside the quiet zone is a guard bar, a unique pattern that defines the start (or end) of the code. UPC-A has quiet zones and guard bars at both the start and end of the code. A unique center guard bar defines the middle of the symbol. The rest of the code is made up of twelve characters, six on each side of the center guard.
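The check character is computed from the other eleven digits. The article does not spell out the rule, so the sketch below uses the standard UPC-A weighting (digits in odd positions count three times). The function names are made up for illustration, and 036000291452 is just a widely quoted sample number.

```python
def upc_check_digit(first_eleven: str) -> int:
    """Return the UPC-A check digit for the first eleven digits."""
    if len(first_eleven) != 11 or not (first_eleven.isascii() and first_eleven.isdigit()):
        raise ValueError("expected exactly 11 digits")
    digits = [int(ch) for ch in first_eleven]
    odd_sum = sum(digits[0::2])   # positions 1, 3, 5, 7, 9, 11
    even_sum = sum(digits[1::2])  # positions 2, 4, 6, 8, 10
    total = 3 * odd_sum + even_sum
    # The check digit brings the weighted total up to a multiple of 10.
    return (10 - total % 10) % 10


def is_valid_upc(code: str) -> bool:
    """True if a 12-digit string carries the right check digit."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        return False
    return upc_check_digit(code[:11]) == int(code[11])


print(upc_check_digit("03600029145"))  # 2
print(is_valid_upc("036000291452"))    # True
```

A single mistyped or misread digit changes the weighted total, so the check fails and the scanner can reject the read instead of ringing up the wrong product.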

To envision how UPC-A encodes data, think of Morse code. If one drew all the dots and dashes of a Morse code message, they would have a rudimentary barcode. In practice, the Morse character set doesn’t work very well because it uses variable-length characters. A ‘T’ is one dash, while a ‘Y’ is three dashes and a dot. Determining where one character ends and another begins would require spaces to be added between every character. That works in radio communications, but becomes inefficient on the printed page.

Characters – It’s all in the widths

Individual UPC characters are also fixed width. The basic unit of length is called a module, which represents the smallest bar or space used in the symbology. The nominal module size used by UPC-A is 0.33 mm. Each UPC character is made of two bars and two spaces, with a total length of 7 modules. For the digit 0 on the left side of a UPC-A, the character is 3,2,1,1: a space 3 modules wide, followed by a bar 2 modules wide, then a space and a bar 1 module wide each.
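As a rough sketch, the width notation can be expanded into individual modules. The table below lists the left-side widths for all ten digits as defined by the UPC-A symbology; only the entry for 0 is quoted in the text above. The names `LEFT_WIDTHS` and `left_modules` are illustrative.

```python
# Left-side characters as (space, bar, space, bar) widths in modules.
LEFT_WIDTHS = {
    0: (3, 2, 1, 1), 1: (2, 2, 2, 1), 2: (2, 1, 2, 2), 3: (1, 4, 1, 1),
    4: (1, 1, 3, 2), 5: (1, 2, 3, 1), 6: (1, 1, 1, 4), 7: (1, 3, 1, 2),
    8: (1, 2, 1, 3), 9: (3, 1, 1, 2),
}


def left_modules(digit: int) -> str:
    """Expand a left-side digit into 7 modules: '0' = space, '1' = bar."""
    pieces = []
    for index, width in enumerate(LEFT_WIDTHS[digit]):
        # Left-side characters start with a space, then alternate.
        color = "0" if index % 2 == 0 else "1"
        pieces.append(color * width)
    return "".join(pieces)


for d in range(10):
    # Draw bars as '#' and spaces as '.' for a quick visual check.
    print(d, left_modules(d).replace("1", "#").replace("0", "."))
```

Every row comes out exactly seven modules wide, which is what lets a reader know where one character stops and the next begins without any separator.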

Characters on the right of the center guard bar are color inverted from those on the left. That means every character on the left side starts with a space, while every character on the right starts with a bar.

Why all the complication with two inverted character sets? Direction! The grocery checkout barcode scanner hasn’t changed much over the years. It’s mounted in a slot and items are passed over it. The barcode can be in any orientation, so the scanner has to be able to decode the symbol left to right, right to left, or at nearly any angle.
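Here is a minimal sketch of how the two character sets fit into a whole symbol and how a decoder might use them to tell direction. The guard patterns (101 at each end, 01010 in the center) come from the UPC-A symbology rather than from this article, and `encode_upca` and `scan_direction` are hypothetical helper names. The direction test relies on a property of the code tables: every left-side character contains an odd number of bar modules, so its inverted right-side twin contains an even number.

```python
# Left-side patterns for digits 0-9 ('0' = space module, '1' = bar module).
LEFT = ["0001101", "0011001", "0010011", "0111101", "0100011",
        "0110001", "0101111", "0111011", "0110111", "0001011"]
# Right-side patterns are the color-inverted left-side patterns.
RIGHT = ["".join("1" if m == "0" else "0" for m in p) for p in LEFT]

END_GUARD = "101"
CENTER_GUARD = "01010"


def encode_upca(code: str) -> str:
    """Return the 95 modules of a UPC-A symbol (quiet zones not included).

    This sketch does not verify the check digit.
    """
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected exactly 12 digits")
    left = "".join(LEFT[int(d)] for d in code[:6])
    right = "".join(RIGHT[int(d)] for d in code[6:])
    return END_GUARD + left + CENTER_GUARD + right + END_GUARD


def scan_direction(modules: str) -> str:
    """Guess scan direction from the first character after the guard."""
    first_char = modules[3:10]
    bars = first_char.count("1")
    # Odd bar count: a left-side character, so we are reading forward.
    return "forward" if bars % 2 == 1 else "reverse"


symbol = encode_upca("036000291452")
print(len(symbol))                   # 95
print(scan_direction(symbol))        # forward
print(scan_direction(symbol[::-1]))  # reverse
```

Reversing a character does not change how many bar modules it contains, so a scanner that sweeps the label backwards sees an even count first and knows to flip the data before decoding.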

The important thing to remember about reading a barcode is that it’s all about relative widths. With a handheld scanner, the barcode can be at any reasonable distance from the scanner. A more distant barcode will appear smaller than a close one. A reader simply has to compare the smallest element it sees (the module) to the width of the other elements. Once these relative widths conform to the rules of the quiet zone and guard bars, the reader decides it has found a possible code and begins to look for characters.
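The idea of relative widths can be sketched in a few lines. The example below assumes the reader has already measured four run lengths (space, bar, space, bar) for one left-side character in arbitrary units such as pixels. Instead of relying on the single smallest element, it estimates the module as one seventh of the character's total width, which is one common way to make the comparison less sensitive to noise. The function names and sample measurements are invented for illustration.

```python
# Left-side characters as (space, bar, space, bar) widths in modules.
LEFT_WIDTHS = {
    (3, 2, 1, 1): 0, (2, 2, 2, 1): 1, (2, 1, 2, 2): 2, (1, 4, 1, 1): 3,
    (1, 1, 3, 2): 4, (1, 2, 3, 1): 5, (1, 1, 1, 4): 6, (1, 3, 1, 2): 7,
    (1, 2, 1, 3): 8, (3, 1, 1, 2): 9,
}


def normalize_widths(run_lengths, modules_per_char=7):
    """Convert measured run lengths into whole-module widths."""
    module = sum(run_lengths) / modules_per_char  # estimated module size
    return tuple(max(1, round(r / module)) for r in run_lengths)


def decode_left_char(run_lengths):
    """Return the digit for four measured runs, or None if no match."""
    return LEFT_WIDTHS.get(normalize_widths(run_lengths))


# The same digit measured close up, far away, and with some noise.
print(decode_left_char([62, 38, 22, 18]))  # 0
print(decode_left_char([9, 6, 3, 3]))      # 0
print(decode_left_char([5, 5, 5, 20]))     # 6
```

Because only ratios matter, the same lookup works whether the label fills the scanner's view or sits at arm's length.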

UPC-A may have been the first commonly used barcode, but it didn’t stand alone for long. Europe modified the spec, adding a digit. The resulting code was called EAN-13. EAN codes all include a three digit country code. An odd side effect of this was the creation of the fictional country “Bookland”, which is used for books and other publications.
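The relationship between the two codes can be sketched briefly. A UPC-A number is commonly treated as an EAN-13 number with a leading zero, and the EAN-13 check digit uses the same alternating weights, counted so that the result matches. The helper names below are hypothetical; the second sample is a frequently quoted ISBN-13 whose 978 prefix is the Bookland code.

```python
def ean13_check_digit(first_twelve: str) -> int:
    """Return the EAN-13 check digit for the first twelve digits."""
    if len(first_twelve) != 12 or not (first_twelve.isascii() and first_twelve.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(ch) * (3 if i % 2 else 1)
                for i, ch in enumerate(first_twelve))
    return (10 - total % 10) % 10


def upca_to_ean13(upca: str) -> str:
    """Express a 12-digit UPC-A number as a 13-digit EAN-13 number."""
    if len(upca) != 12 or not (upca.isascii() and upca.isdigit()):
        raise ValueError("expected exactly 12 digits")
    return "0" + upca


print(upca_to_ean13("036000291452"))      # 0036000291452
print(ean13_check_digit("003600029145"))  # 2, same as the UPC-A check digit
print(ean13_check_digit("978030640615"))  # 7
```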

Today, there are dozens of different barcode symbologies out there. Commonly used linear symbologies include Code 39, Code 128, GS1 DataBar, Interleaved 2 of 5, and MSI Plessey. When one line isn’t enough, 2D symbologies are used, including PDF417, Aztec Code, MaxiCode, Data Matrix, and QR Code. We take for granted how easy it is to scan a code and jump to a webpage, but it all started with the simple UPC.

Barcode images from Wikipedia.

Monday, November 2
