Simplicat XML Catalog Publisher

What is Simplicat?

Simplicat XML Catalog Publisher is a small-scale batch-process print catalog publishing system built using XML, XSLT, XSL-FO, CSS, and related standards. It accepts an input XML file containing product data and product grouping/table shape definitions, and a CSS style sheet, and outputs a complete product catalog in PDF format.

Simplicat is still under development and is not yet released; please contact us at info@intsysr.com for more information.

Major Features

Simplicat is built around industry-standard open-source components, including Saxon XSLT for data manipulation and Apache FOP for PDF generation. Its current features include:

  • regular product listings with name, description, image, pricing;
  • categorical or alphabetical layout from the same input XML;
  • single- and two-column page layouts, with interspersed column-wide and page-wide listings in two-column mode;
  • sidetabs based on product category (categorical) or first letter of product name (alphabetical);
  • running category and product name headers;
  • assembly of multiple related products into group listings;
  • generation of tables to summarize group listings based on user-defined table shapes;
  • table-of-contents generation (including use of PDF bookmarks);
  • alphabetical product index and product number index generation; and
  • user-supplied title image page.

It is best suited to industrial and directory-style product catalogs that are data-intensive but use a constrained set of listing styles. It is generally suited to catalogs with up to 2500 products and up to two levels of product categories. It is not intended for glossy consumer-oriented catalogs.

Input XML

The system accepts product data in XML form. This data should include product name, SKU or product number, category, description, image, attribute values (such as color, size, and weight), and pricing data. It is very similar in content to the CSV export files produced by major eCommerce shopping cart systems, including Magento, osCommerce, and OpenCart, and in fact this XML can easily be produced from these export files (an example is TBD).

Here is an example of an input XML product record:

<?xml version="1.0"?>
<products>
  <product id="P1">
    <name>Thermostat</name>
    <sku>P001</sku>
    <category>Household Products</category>
    <desc>This is the description for a small thermostat.</desc>
    <img src="thermostat_sm.gif"/>
    <attributes>
      <attribute name="size">small</attribute>
      <attribute name="color">white</attribute>
    </attributes>
  </product>
</products>
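
The conversion from a shopping-cart CSV export mentioned above might look something like the following Python sketch. The CSV column names and the choice of which columns become attributes are assumptions made for illustration; real exports from Magento, osCommerce, or OpenCart use their own column layouts, and this is not Simplicat's own converter.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV export; real shopping-cart exports use their own column names.
CSV_EXPORT = """id,name,sku,category,description,image,size,color
P1,Thermostat,P001,Household Products,This is the description for a small thermostat.,thermostat_sm.gif,small,white
"""

# Columns treated as product attributes (an assumption for this sketch).
ATTRIBUTE_COLUMNS = ["size", "color"]


def csv_to_products(csv_text):
    """Build a <products> element from CSV text, one <product> per row."""
    root = ET.Element("products")
    for row in csv.DictReader(io.StringIO(csv_text)):
        product = ET.SubElement(root, "product", id=row["id"])
        ET.SubElement(product, "name").text = row["name"]
        ET.SubElement(product, "sku").text = row["sku"]
        ET.SubElement(product, "category").text = row["category"]
        ET.SubElement(product, "desc").text = row["description"]
        ET.SubElement(product, "img", src=row["image"])
        attributes = ET.SubElement(product, "attributes")
        for column in ATTRIBUTE_COLUMNS:
            if row.get(column):  # skip empty attribute cells
                ET.SubElement(attributes, "attribute", name=column).text = row[column]
    return root


products = csv_to_products(CSV_EXPORT)
print(ET.tostring(products, encoding="unicode"))
```

Pricing is left out of the sketch because the pricing element is not shown in the sample record above.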

In addition to product data, the catalog building process requires other specifications in order to proceed. These include:

  • the title of the catalog;
  • the company name;
  • rules for how to combine products into groups;
  • rules for how to build tables; and
  • CSS style information for the major catalog elements such as category names, product titles, tables, etc.

Regular Product Listings

Simplicat allows CSS styles to define the appearance of product titles, description, attributes, etc. For example, in our example catalog, the title is shown in 14pt white text on a grey background; attributes are shown in name-value pairs separated by dot leaders, and prices are tabled in 3 columns showing SKU, quantity, and price. Here's a sample product listing based on this style (and others):

Small Thermostat

This is the description for a small thermostat.

size...small
color...white

Item Number   Quantity   Pricing
P001          1-10       $10.00
P001          11+        $9.00

Page Layout Features

Single or Two-Column Layout

Product listings can be shown in a one-column layout that spans the width of the page, or in a two-column layout with listings flowing from the left-side column into the right-side.

Columnwide and Pagewide Listings

In two-column mode, some listings can be explicitly marked to override the columns and span the entire page. These can be interspersed arbitrarily among the other column-wide listings (the columns are balanced before each such pagewide listing). In addition, group listing definitions can be specified as pagewide as they are created by the system.

Floating Text Around Images

By default, items within a product listing are assembled sequentially as the page flows down from the top. Typically a title is shown, followed by the image, then a description, attributes, and pricing. However, many times it is desirable to wrap all or part of the textual content around the image. The Apache FOP formatter currently does not support floating text; however, as a workaround, Simplicat provides a tabled listing feature, where an image can be shown in its own column next to the first paragraph of description text (with subsequent paragraphs set underneath).

Headers and Footers

Page headers and footers are each user-defined as containing 3 column-areas (left, middle, and right) that can have content populated from the following:

  • null (blank);
  • Current (running) category title;
  • Current (running) first product name on page;
  • Company name;
  • Page number; and/or
  • Custom text such as company website address, email, and/or phone number.

Sidetabs

Sidetabs can be created containing the category title (categorical) or first letter of the product name (alphabetical). Recto and verso pages correctly alternate the printed position of the sidetab for proper printing. Sidetab width, height, and color are all user-definable, and sidetabs can be drawn with rounded borders for a professional look and feel. Sidetab positions "walk" down the page as they change until a user-defined maximum is reached, then reset to the top; so, for example, it is possible for an alphabetical catalog to define:

  • all 26 letters as small sidetabs that proceed down the page from A-Z; or
  • a user-specified maximum (say 6) of fairly wide sidetabs, which repeat their positions 4+ times as they increment through the 26 letters.
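
The walk-and-reset behavior described above can be sketched as a simple modulo over slot positions. The function name and the zero-based slot numbering (0 is the top of the page) are assumptions for illustration, not Simplicat's actual configuration or code.

```python
import string


def sidetab_slots(labels, max_slots):
    """Assign each label a slot index, walking down the page and wrapping at max_slots."""
    return {label: index % max_slots for index, label in enumerate(labels)}


letters = list(string.ascii_uppercase)

# 26 small sidetabs: one slot per letter, A at the top and Z at the bottom.
all_letters = sidetab_slots(letters, max_slots=26)

# At most 6 wide sidetabs: after F the position resets, so G shares A's slot.
six_slots = sidetab_slots(letters, max_slots=6)
print(six_slots["A"], six_slots["F"], six_slots["G"], six_slots["Z"])  # prints "0 5 0 1"
```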

Group Listings

A significant feature of the system is the ability to combine multiple related products from their initial flat-file shape into groups. For example, consider a company that sells different types of thermostats. The thermostats share some characteristics but differ in others. The product XML might look like:

<product id="P1">
<name>Small Thermostat</name>
<attributes>
<attribute name="size">small</attribute>
<attribute name="color">white</attribute>
</attributes>
</product>

<product id="P2">
<name>Large Thermostat</name>
<attributes>
<attribute name="size">large</attribute>
<attribute name="color">white</attribute>
</attributes>
</product>

These 2 products can be combined into a parent product for all thermostats. The parent group listing can be constructed from two input sources:

  • from data common to all the child products, such as the "white" color attribute, that is automatically moved into the parent listing; and
  • additional inserted data, not part of the original product data, such as a new description specified for the entire group.

For example:

<product id="group1">
<name>Thermostats</name>
<desc>This is the inserted description for both products.</desc>
<attributes>
<attribute name="color">white</attribute>
</attributes>

<product id="P1">
<name>Small Thermostat</name>
<desc>This is the description for a small thermostat.</desc>
<attributes>
<attribute name="size">small</attribute>
</attributes>
</product>

<product id="P2">
<name>Large Thermostat</name>
<desc>This is the description for a large thermostat.</desc>
<attributes>
<attribute name="size">large</attribute>
</attributes>
</product>
</product>
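
As a rough illustration of how common data could be moved into the parent, here is a minimal Python sketch that hoists attribute name/value pairs shared by every child into a new group element. The function `build_group` and its parameters are hypothetical; Simplicat itself is built on XSLT, and its grouping rules are not shown in this document.

```python
import xml.etree.ElementTree as ET

CHILDREN_XML = """
<products>
  <product id="P1">
    <name>Small Thermostat</name>
    <attributes>
      <attribute name="size">small</attribute>
      <attribute name="color">white</attribute>
    </attributes>
  </product>
  <product id="P2">
    <name>Large Thermostat</name>
    <attributes>
      <attribute name="size">large</attribute>
      <attribute name="color">white</attribute>
    </attributes>
  </product>
</products>
"""


def build_group(children, group_id, name, desc):
    """Wrap child products in a parent product and hoist shared attributes."""
    group = ET.Element("product", id=group_id)
    ET.SubElement(group, "name").text = name
    ET.SubElement(group, "desc").text = desc  # inserted data, not from the children
    group_attrs = ET.SubElement(group, "attributes")

    # An attribute is common when every child carries the same name/value pair.
    pair_sets = [
        {(attr.get("name"), attr.text) for attr in child.iter("attribute")}
        for child in children
    ]
    common = set.intersection(*pair_sets) if pair_sets else set()

    for attr_name, value in sorted(common):
        ET.SubElement(group_attrs, "attribute", name=attr_name).text = value

    # Remove the hoisted pairs from each child, then nest the child in the group.
    for child in children:
        attrs = child.find("attributes")
        if attrs is not None:
            for attr in list(attrs):
                if (attr.get("name"), attr.text) in common:
                    attrs.remove(attr)
        group.append(child)
    return group


children = ET.fromstring(CHILDREN_XML).findall("product")
group = build_group(children, "group1", "Thermostats",
                    "This is the inserted description for both products.")
print(ET.tostring(group, encoding="unicode"))
```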

Here's how the group listing might be formatted:

  • the name for the group is displayed in white 14pt text on a grey background, giving it the same appearance as the example of the regular product above; and
  • the child product names are shown in smaller 11pt black text with no colored background, to illustrate their subordinate level in the group.

Thermostats

This is the inserted description for both products.
color...white

Small Thermostat

This is the description for a small thermostat.
size...small

Item Number   Quantity   Pricing
P001          1-10       $10.00
P001          11+        $9.00

Large Thermostat

This is the description for a large thermostat.
size...large

Item Number   Quantity   Pricing
P002          1-10       $10.00
P002          11+        $9.00

Table Generator

Standard group listings like the one above are quite useful in many situations. However, they can still take a lot of space in the catalog. To reduce the amount of space required, grouped listings can be summarized in a tabular format. Simplicat provides a table definition feature that allows a catalog author to specify simple XPath expressions for each column to be built. For example, here's a definition that constructs a table showing the color, size, description, SKU, and price for each child product:

<tabledef id="table1" width="100%" border="1" cols="40 40 200 40 100" sort="color">
<column title="Color" xpath="attributes/attribute[@name='color']" autospan="true"/>
<column title="Size" xpath="attributes/attribute[@name='size']" autospan="true"/>
<column title="Description" xpath="desc"/>
<column title="Item Number" xpath="sku"/>
<column title="Pricing" xpath="pricing"/>
</tabledef>
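
To make the idea of one XPath expression per column concrete, here is a small Python sketch that evaluates each column's `xpath` against every child product. It uses the limited XPath subset in Python's `xml.etree.ElementTree`; the `build_table` helper is hypothetical, the `sort`, `autospan`, width, and pricing settings are ignored, and the sample data keeps the color attribute on each child for simplicity.

```python
import xml.etree.ElementTree as ET

TABLEDEF_XML = """
<tabledef id="table1">
  <column title="Color" xpath="attributes/attribute[@name='color']"/>
  <column title="Size" xpath="attributes/attribute[@name='size']"/>
  <column title="Description" xpath="desc"/>
  <column title="Item Number" xpath="sku"/>
</tabledef>
"""

GROUP_XML = """
<product id="group1">
  <product id="P1">
    <sku>P001</sku>
    <desc>This is the description for a small thermostat.</desc>
    <attributes>
      <attribute name="size">small</attribute>
      <attribute name="color">white</attribute>
    </attributes>
  </product>
  <product id="P2">
    <sku>P002</sku>
    <desc>This is the description for a large thermostat.</desc>
    <attributes>
      <attribute name="size">large</attribute>
      <attribute name="color">white</attribute>
    </attributes>
  </product>
</product>
"""


def build_table(tabledef, group):
    """Return (header, rows): one row per child product, one cell per column."""
    columns = tabledef.findall("column")
    header = [column.get("title") for column in columns]
    rows = [
        # A column whose XPath matches nothing yields an empty cell.
        [child.findtext(column.get("xpath"), default="") for column in columns]
        for child in group.findall("product")
    ]
    return header, rows


header, rows = build_table(ET.fromstring(TABLEDEF_XML), ET.fromstring(GROUP_XML))
print(header)
for row in rows:
    print(row)
```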

Here's what the resulting listing looks like:

Thermostats

This is the inserted description for both products.

Color   Size    Description                                       Item Number   Pricing
white   small   This is the description for a small thermostat.   P001          1-10  $10.00
                                                                                11+   $9.00
white   large   This is the description for a large thermostat.   P002          1-10  $10.00
                                                                                11+   $9.00

Table Post-Processing

Once a basic child product table has been produced, Simplicat provides some additional table processing options that can create more complex and interesting tables. For example, here's a basic table that could have been built using the table generator:

Color   Size    Description     Price
blue    large   product one     $10.00
blue    small   product two     $5.00
red     small   product three   $5.00
blue    small   product four    $5.00

We can now perform some additional post-processing steps on this table as part of the table specifications. These can include:

  • Row Span Generator
  • Row Header Generator
  • Column Merge
  • Pivot Table

Row Span Generator

This step combines common column values into one column that spans rows. For example, in the table above, the "blue" colors will be combined. This feature is somewhat limited in scope currently (limited to row spans only, and only for the first 3 columns) but we are working to improve it over time.

Color   Size    Description     Price
blue    large   product one     $10.00
        small   product two     $5.00
                product four    $5.00
red     small   product three   $5.00
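
One way to picture this step is the Python sketch below, which sorts the rows on the leading columns and blanks out repeated values. Blank cells stand in for real row spans, and the function name and its `span_columns` parameter are assumptions rather than Simplicat's own interface.

```python
ROWS = [
    ["blue", "large", "product one", "$10.00"],
    ["blue", "small", "product two", "$5.00"],
    ["red", "small", "product three", "$5.00"],
    ["blue", "small", "product four", "$5.00"],
]


def row_spans(rows, span_columns):
    """Sort rows on the first span_columns columns and blank repeated leading values."""
    ordered = sorted(rows, key=lambda row: row[:span_columns])  # stable sort
    result, previous = [], None
    for row in ordered:
        shown = list(row)
        for col in range(span_columns):
            # Suppress a value only while every column to its left also repeats.
            if previous is not None and row[:col + 1] == previous[:col + 1]:
                shown[col] = ""
            else:
                break
        result.append(shown)
        previous = row
    return result


for row in row_spans(ROWS, span_columns=2):
    print(row)
```

Because a value is only suppressed while the columns to its left also repeat, the "small" under "red" is kept even though the row above it is also "small".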

Row Header Generator

This step creates row headers that span all the columns for each distinct value of a column, and then removes that column from the table. For example, here is the table produced by calling the row header generator for column 1 of the example table above:

Size    Description     Price
blue
large   product one     $10.00
small   product two     $5.00
small   product four    $5.00
red
small   product three   $5.00
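
The row header step can be sketched in Python as grouping rows by the chosen column and dropping it. The function name is hypothetical, columns are numbered from 1 to match the text, and a single-cell row stands in for a header that spans all columns.

```python
HEADER = ["Color", "Size", "Description", "Price"]
ROWS = [
    ["blue", "large", "product one", "$10.00"],
    ["blue", "small", "product two", "$5.00"],
    ["red", "small", "product three", "$5.00"],
    ["blue", "small", "product four", "$5.00"],
]


def row_headers(header, rows, column):
    """Group rows under one header row per distinct value of a 1-based column."""
    idx = column - 1
    new_header = header[:idx] + header[idx + 1:]
    groups = {}  # dicts keep insertion order, so groups appear in first-seen order
    for row in rows:
        groups.setdefault(row[idx], []).append(row[:idx] + row[idx + 1:])
    body = []
    for value, members in groups.items():
        body.append([value])  # single-cell row, meant to span every column
        body.extend(members)
    return new_header, body


new_header, body = row_headers(HEADER, ROWS, column=1)
print(new_header)
for row in body:
    print(row)
```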

Column Merge

This step merges the data in 2 columns into one column and renames the resulting column header. For example, here's the result of combining the first 2 columns of the example table into a single generic "Type" column:

Type         Description     Price
large blue   product one     $10.00
small blue   product two     $5.00
small red    product three   $5.00
small blue   product four    $5.00
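
A column merge along these lines might be sketched as follows. The function name, the 1-based column numbers, and the space used to join the two values are assumptions chosen to reproduce the example above; the merged column is placed where the leftmost of the two source columns was.

```python
HEADER = ["Color", "Size", "Description", "Price"]
ROWS = [
    ["blue", "large", "product one", "$10.00"],
    ["blue", "small", "product two", "$5.00"],
    ["red", "small", "product three", "$5.00"],
    ["blue", "small", "product four", "$5.00"],
]


def merge_columns(header, rows, first, second, title):
    """Merge two 1-based columns into one, joining values as 'first second'."""
    i, j = first - 1, second - 1
    position = min(i, j)  # merged column takes the leftmost source position

    def merge(row, merged_value):
        kept = [value for k, value in enumerate(row) if k not in (i, j)]
        return kept[:position] + [merged_value] + kept[position:]

    new_header = merge(header, title)
    new_rows = [merge(row, f"{row[i]} {row[j]}") for row in rows]
    return new_header, new_rows


# Size (column 2) followed by Color (column 1), renamed to "Type".
new_header, new_rows = merge_columns(HEADER, ROWS, 2, 1, "Type")
print(new_header)
for row in new_rows:
    print(row)
```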

Pivot Tables

Another post-processing option is to "pivot" the table on one column, using its distinct values as new column headers. For example, here is a pivot table created from the example above using the "Description" column:

        product one   product two   product three   product four
Color   blue          blue          red             blue
Size    large         small         small           small
Price   $10.00        $5.00         $5.00           $5.00
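
The pivot can be pictured as a transpose in which the chosen column supplies the new headers. The Python sketch below assumes the pivot column's values are distinct and selects the column by title; the function name is hypothetical.

```python
HEADER = ["Color", "Size", "Description", "Price"]
ROWS = [
    ["blue", "large", "product one", "$10.00"],
    ["blue", "small", "product two", "$5.00"],
    ["red", "small", "product three", "$5.00"],
    ["blue", "small", "product four", "$5.00"],
]


def pivot(header, rows, column_title):
    """Turn the values of one column into column headers and transpose the rest."""
    idx = header.index(column_title)
    # The first header cell is empty: it sits above the old column titles.
    new_header = [""] + [row[idx] for row in rows]
    body = [
        [title] + [row[k] for row in rows]
        for k, title in enumerate(header)
        if k != idx
    ]
    return new_header, body


new_header, body = pivot(HEADER, ROWS, "Description")
print(new_header)
for row in body:
    print(row)
```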

Successive Application of Table Post-Process Steps

The table post-process steps described above can be applied successively. For example, here's the table above with the colmerge(2,1) step followed by the rowheader(1) step:

Description     Price
large blue
product one     $10.00
small blue
product two     $5.00
product four    $5.00
small red
product three   $5.00

Alphabetical and Categorical Layouts

The same product data can be published under two distinct layout geometries: alphabetical and categorical. In alphabetical mode, all the products are sorted alphabetically by name, and a section is created for each letter that has one or more products. The (optional) sidetab for each section is simply the letter for that section. In categorical mode, a section is created for each product category, and all products for that category are included in the section, and optionally sorted within the section. The optional sidetab is the first word of the category title; for example, if the category is "Household Items", then the sidetab is "Household".
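
The two modes can be sketched in Python as two ways of keying the same product list. The product records other than the thermostat are made up for illustration, and sorting the sections themselves by key is an assumption of this sketch, not a documented Simplicat behavior.

```python
from collections import defaultdict

# Hypothetical product data; only the fields needed for sectioning are shown.
PRODUCTS = [
    {"name": "Thermostat", "category": "Household Items"},
    {"name": "Test Tube", "category": "Laboratory Supplies"},
    {"name": "Beaker", "category": "Laboratory Supplies"},
]


def build_sections(products, mode):
    """Return (sidetab, product names) pairs for 'alphabetical' or 'categorical' mode."""
    sections = defaultdict(list)
    for product in sorted(products, key=lambda p: p["name"].lower()):
        if mode == "alphabetical":
            key = product["name"][0].upper()  # one section per first letter
        else:
            key = product["category"]         # one section per category
        sections[key].append(product["name"])
    result = []
    for key in sorted(sections):
        # Categorical sidetabs use only the first word of the category title.
        sidetab = key if mode == "alphabetical" else key.split()[0]
        result.append((sidetab, sections[key]))
    return result


print(build_sections(PRODUCTS, "alphabetical"))
print(build_sections(PRODUCTS, "categorical"))
```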

Rich Text

The system fully supports a set of standard HTML-style rich text tags that can appear within description, name, title, and other types of text in the product data:

Tag Name   Description        Example
b          bold text          this is bold text
i          italic text        this is italic text
u          underlined text    this is underlined text
sup        superscript text   this is superscript text
sub        subscript text     this is subscript text

Automatic Keeps Analysis

Another feature of the system is an automatic keeps analysis. For example, numeric values should always be kept with their units; values such as 150 mm should never allow a line break to occur between the "150" and the "mm". Although it is possible to manually specify this within description text by the use of an XML <keep> tag, most product export files will not have been authored this way. Instead the system provides an automatic check that inserts keep tags around numeric values followed by standard suffixes such as mm, cm, lbs., etc.

For example, prior to the PDF generation step, the system would transform the following XML product description:

<desc>This is a description for a 150 ml test tube.</desc>

into:

<desc>This is a description for a <keep>150 ml</keep> test tube.</desc>

This would prevent the PDF formatter (Apache FOP) from breaking the line between "150" and "ml".
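
A check of this kind could be approximated with a regular expression, as in the Python sketch below. The unit list is an assumed sample, the function works on plain description text rather than on the XML tree, and it does not try to detect text that is already wrapped in a keep tag.

```python
import re

# Assumed sample of unit suffixes; the real list is not given in this document.
UNITS = ["mm", "cm", "ml", "lbs."]

# Longest units first so a longer suffix is never shadowed by a shorter one.
_unit_pattern = "|".join(re.escape(unit) for unit in sorted(UNITS, key=len, reverse=True))

# A number, whitespace, then a known unit that is not followed by more word characters.
KEEP_RE = re.compile(rf"\b(\d+(?:\.\d+)?)\s+({_unit_pattern})(?!\w)")


def add_keeps(text):
    """Wrap each 'number unit' pair in <keep> so it cannot break across lines."""
    return KEEP_RE.sub(r"<keep>\1 \2</keep>", text)


print(add_keeps("This is a description for a 150 ml test tube."))
# prints: This is a description for a <keep>150 ml</keep> test tube.
```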

Index Generation

The system automatically generates rear matter pages containing an alphabetical index of product names, and an SKU/product number index sorted by SKU. Each index can be included in or excluded from the final PDF, and can be laid out in one or two columns.

For example, here is the format of the alphabetical index produced (note that in PDF there will be true dot leaders instead of the simple examples here):

Further Information

Simplicat is still under development. We hope to provide an alpha release of the product in Spring 2010. For more information, including availability, pricing, and support plans, please email us at info@intsysr.com.

All material Copyright © 2010 Intelligent Systems Research LLC. All Rights Reserved.