Tag Archives: u.s customs data

Complete Transformation: Final Refinements & Enhancements Applied to U.S. Customs Data

Next along its path of transformation and enlightenment, Customs data is enhanced with a number of ancillary but related (and connectable) data sets.  Once the company location and name have been successfully resolved, the location can be assigned to a respective county, city, MSA, CSA, CBSA, congressional district, area code, time zone or even latitude-longitude.  Thereafter it can be integrated into a dynamic mapping application or be used in detailed geo-related analysis and reporting.  Any number of “groupings” or consolidating factors can thus be applied.
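To make the geo enhancement concrete, here is a minimal sketch of how a resolved location might pick up county, CBSA, time zone and coordinates from a ZIP-keyed reference table. The table, the field names and the `enrich` function are assumptions made for illustration (the values are approximate placeholders), not the actual referential database described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GeoRecord:
    county: str
    city: str
    cbsa: str
    time_zone: str
    latitude: float
    longitude: float


# Hypothetical reference table keyed by 5-digit ZIP code.
# Values are approximate and for illustration only.
ZIP_REFERENCE = {
    "33602": GeoRecord(
        county="Hillsborough",
        city="Tampa",
        cbsa="Tampa-St. Petersburg-Clearwater, FL",
        time_zone="America/New_York",
        latitude=27.95,
        longitude=-82.46,
    ),
}


def enrich(company: dict) -> dict:
    """Return a copy of the company record with a 'geo' entry attached.

    'geo' is None when the ZIP code is missing or not in the reference table.
    """
    geo: Optional[GeoRecord] = ZIP_REFERENCE.get(company.get("zip", ""))
    enriched = dict(company)
    enriched["geo"] = geo
    return enriched
```

Once every resolved company carries a record like this, grouping by county, CBSA or time zone becomes a simple matter of keying on the attached fields.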

GEO MAPPING of customs data after resolution of company name iterations, disparate locations and other aberrations.

Expanded information about the vessels (ships), containers (types, sizes), container owners, ports, carriers (SCAC reference) and any number of “connectable” relevant databases can also be linked during this step of the process. The possibilities are as infinite as imagination and business requirements dictate.

Referential database utilized to enhance normalized AMS Customs data.

At this point, the U.S. Customs data has been imported, organized, cleaned, groomed and dressed.  It is now ready for “prime time”.  It’s show time in the data world.  The data is now ready to move into an “exportable” mode.  Therefore, it is organized (and refreshed) within a distinct database of its own.

All the processes detailed over the last several days, covering tens of thousands of individual BOLs, are completed in a matter of hours.  These routines run every day.  All databases and processes – through the creation/update of the “AMS Trade Data Export” DB – occur internally and securely (behind the veil).

Now four processes, involving six databases, have been completed.  Thereafter, two additional processes are initiated on a weekly basis. One integrates various clients’ proprietary data with our completed Customs Data and updates the databases upon which their respective web applications sit.  The second process refreshes the reporting repository from which the host of commercially available web applications draw.

The Journey of Customs Data Transformation from Raw data to Trade Intelligence.

Transformed Customs data, integrated with other statistical, company trade and economic data sources can be a powerful tool to navigate and succeed within the multi-trillion dollar international trade marketplace.  A plethora of applications and services can be enhanced by the skilled utilization of Trade Intelligence.

Customs data tracks over $1 Trillion of U.S. Imports annually.

The collection of programs, procedures and referential databases with which we transform raw data into usable business intelligence we refer to as our “A.I.” (Artificial Intelligence Engine). It, along with our huge data repositories of statistical, company and transactional data collected over the years, together represent the primary assets (along with the human intelligence and experience by which to integrate, develop and deploy them) that we are offering for license, sale or joint venture consideration.

A.I. Artificial Intelligence Engine transforms data into intelligence to support international trade services and applications.

Please refer to our Commercial Services Menu on the top navigational bar of this site for information on Application Licensing, Research & Writing Services, Database Repositories, Artificial Intelligence Engine as well as other Consulting, Project Management and Application Development Services.

U.S. Customs Data: Resolving the Enigmatic & Challenging Problems Inherent Within the Data.

Two of the most important normalization processes are accounting for the many iterations of company names and establishing an accurate company location.  See the previously published article, “The ABCs of U.S. Customs Data: Issues & Shortcomings”.  There can be many dozen iterations of the same company name.  This wreaks havoc with the veracity of the data under analysis.  The problem is evident in a cursory review of trade intelligence applications offered by most data vendors.

Text Strings converted into Data tokens for analysis and processing

In order to resolve these issues, the name and address fields contained on the bills of lading (for both shipper and receiver) are broken down into “tokens” and compared with a dynamically evolving referential database of “resolved” names and addresses.  Actually, accurately “geo-locating” the entity is the simpler of the two tasks. Zip codes, for the U.S. at least, follow a predictable pattern and typically occur at the end of the text string in the “address” block of the flat file.
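As a rough illustration of the tokenizing idea, the sketch below splits a name or address string into upper-case tokens and pulls a U.S. ZIP code off the end of an address block. The function names and the regular expression are my own assumptions for the example; they are not the routines used inside the A.I. Engine.

```python
import re
from typing import List, Optional

# A 5-digit ZIP (optionally ZIP+4) at the very end of the string,
# not preceded by another digit (so long phone numbers are not mistaken for ZIPs).
ZIP_AT_END = re.compile(r"(?<!\d)(\d{5})(?:-\d{4})?\s*$")


def tokenize(text: str) -> List[str]:
    """Break a name or address string into upper-case alphanumeric tokens."""
    return re.findall(r"[A-Z0-9]+", text.upper())


def extract_zip(address_block: str) -> Optional[str]:
    """Return the 5-digit ZIP code found at the end of the block, if any."""
    match = ZIP_AT_END.search(address_block)
    return match.group(1) if match else None
```

For example, `tokenize("Acme Trading Co., Ltd.")` yields the tokens ACME, TRADING, CO and LTD, which can then be compared against a library of already resolved tokens.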

The two diagrams below are tables utilized within the fourth database involved in the third step of the transformation.  The first diagram shows elements that are utilized to resolve company location.  The second shows those necessary to resolve company name.

A separate, complementary and very important utility – called the company-location resolver – is THE essential cornerstone of the A.I. (Artificial Intelligence) Engine and is required to dynamically evolve and “educate” the system.  More on that later.

Database used to "normalize" a location for an entity. This is the basis of the innovative geo-location feature utilized in the Prospects Trade Intelligence platform.

Normalization of the shipper or importer is essential to veracity in analysis and reporting. It's a challenging task to say the least. Most Purveyors of Trade Intelligence products using Customs Data don't even bother.

The location – company match utility is a very nifty accessory and vital component of the A.I. Engine.  Although the system is set up to quickly, accurately and automatically normalize U.S. Customs data, it also has the capacity to “learn” and improve its performance over time. Some of this learning takes place automatically over time as it gains more and more experience performing its daily processing rituals.  Adjunct education is interjected manually.

For instance, perhaps during the last several days’, weeks’ or months’ processing routines, our A.I. Engine encountered some company name iterations that it hadn’t handled before and that weren’t in its library of established “tokens”.  Conveniently, it would display these unresolved iterations, ranked by the number of occurrences, along with likely matches.  With one stroke an operator could resolve and match all aberrations or variations on a particular supplier or importer name or location… sometimes representing several hundred or thousand individual BOLs.
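A review queue of this kind can be sketched in a few lines. The example below ranks unresolved name variations by occurrence and suggests likely matches using Python's standard `difflib` similarity matching, which is a stand-in technique chosen for illustration; the company names, the `resolved` library and the `review_queue` and `teach` functions are all hypothetical.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical library of already resolved name variations -> canonical name.
resolved = {
    "ACME TRADING CO": "Acme Trading Co.",
    "GLOBEX IMPORTS INC": "Globex Imports Inc.",
}


def review_queue(raw_names, resolved, top_n=3):
    """List unresolved variations, most frequent first, with likely matches."""
    unresolved = Counter(name for name in raw_names if name not in resolved)
    queue = []
    for name, count in unresolved.most_common():
        candidates = get_close_matches(name, list(resolved), n=top_n, cutoff=0.6)
        queue.append((name, count, candidates))
    return queue


def teach(resolved, variant, matched_key):
    """Operator decision: map a new variation to an existing canonical name."""
    resolved[variant] = resolved[matched_key]
```

After `teach(resolved, "ACME TRADNG CO", "ACME TRADING CO")`, that misspelling no longer appears in the queue on later runs, which is the sense in which the system never has to ask the same question twice.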

Thus the A.I. Engine learned something new.  And unlike its human counterparts, it will never have to ask the same question again.

Despite the sophisticated array of advanced technology deployed to automate the process of transformation, sometimes semi-automated manual intervention is called for.

The location – company match utility also can be used to link unlinked branch locations to their respective parent company or regional/divisional headquarters.  Furthermore, it can process and link a proprietary client’s database of customers as well.  In this fashion, one can monitor customers’ trading activity and supply chain operations on a daily basis! This information can be incorporated into a web application which is distributed within the secure company intranet or protected proprietary web site.  An example is Panalpina, one of our previous (CenTradeX) clients wherein we integrated their proprietary information into a customized web application for distribution to their regional sales offices.

U.S. Customs Data: Parsing & Normalization. The First Steps in its Long, Transformational Journey.

It took us several years (at CenTradeX) to develop an intelligent system by which to quickly and seamlessly assimilate the daily Customs feeds.

Over time we developed and incorporated automated procedures and administered them under an umbrella control panel. Statistical data update processes from U.S. Census and U.N. Comtrade were initiated from this centralized control panel.  U.S. Customs data, initial processing and normalization as well as company, parent and location matching, were also conducted from the same control panel.

A detailed diagram of the individual components that make up the control panel (as a constituent part of the A.I.Engine) can be downloaded from Google Docs by clicking this link.

Component of the A.I. Engine. Control panel by which to initiate data import

Company data collections (from sundry vendors because each contained its own unique non-standardized characteristics) were initially processed utilizing different arrays of queries and procedures.  They were then integrated into the combined company repository which, in turn, was correlated with the U.S. Customs and statistical data.  See U.S. Customs Data Primer Part 4: Enlightenment Through Graphics & Diagrams for illustrative diagrams.

Data sources. Processing schedule

U.S. Customs Data that we referred to as AMS – automated manifest system – went through six distinct processes which are depicted below.  An illustrative diagram of all six processes and eight sequential databases (or collections) can be viewed by clicking this link.

Customs data is received and processed on a daily basis, but the final, resultant databases utilized to serve up web reports were refreshed weekly to allow for enhancements (beauty treatments) and interconnectivity with other data collections.

AMS (Customs) Data requires many steps of parsing, normalizing, refining, integrating and optimizing before it is ready for “prime time”.

Let’s look at the steps from the beginning.  Roughly speaking, the first task is to import all the data properly – correctly parsing all the elements contained in the original “flat file” and organizing them within a relational database.  Every data element and every permutation and aberration must be accounted for.  The diagram below depicts the second of eight databases (the first “DB” is really just a collection of all the raw AMS or Customs data itself). This database results from the first processing step and is refreshed daily.

Parsed Customs data sorted and organized within a relational database structure.

A high(er) resolution depiction of the above diagram can be obtained from our Google Docs site, by clicking this link.

Next comes the “normalization” process, wherein each element of parsed data is refined and standardized. For instance, a simple Port code, whether foreign or domestic, has its corresponding state, province/region, country and normalized name.  Each container code is translated into presentable information about its type, such as refrigerated or not, height, length and particular identifying number. Within this normalization process company name, address, and contact iterations are resolved as well.
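The lookups described above might look something like the sketch below. The port codes are made-up placeholders, and the container decoding assumes, purely for illustration, an ISO 6346 style four-character size/type code (for example `45R1`); the actual AMS code layout and our referential tables are not reproduced here.

```python
# Hypothetical port reference table; the codes are invented for this example.
PORT_REFERENCE = {
    "D0001": {"name": "Tampa", "region": "Florida", "country": "United States"},
    "F0001": {"name": "Shanghai", "region": "Shanghai", "country": "China"},
}

# Small subset of ISO 6346 style size/type characters (assumed encoding).
LENGTH_FT = {"2": 20, "4": 40}
HEIGHT = {"2": "8 ft 6 in", "5": "9 ft 6 in"}
TYPE_GROUP = {"G": "general purpose", "R": "refrigerated"}


def normalize_port(code):
    """Return the normalized name, region and country for a port code, or None."""
    return PORT_REFERENCE.get(code.strip().upper())


def describe_container(type_code):
    """Translate a size/type code into presentable attributes, or None if unknown."""
    code = type_code.strip().upper()
    if len(code) != 4:
        return None
    try:
        return {
            "length_ft": LENGTH_FT[code[0]],
            "height": HEIGHT[code[1]],
            "type": TYPE_GROUP[code[2]],
            "refrigerated": code[2] == "R",
        }
    except KeyError:
        return None
```

Unknown codes come back as `None` rather than raising an error, so they can be set aside for review instead of halting the daily run.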

Below is a diagram depicting the third of eight databases after the second step along the Customs data transformation journey.  A high(er) resolution image is available for download from our Google Docs site.

The second step in the Customs data transformation process.

The Holy Grail of Trade Intelligence: “Connecting the Dots” to U.S. Customs Data

The original CenTradeX Trade Intelligence platform was developed over a three to four-year time frame from Spring 2000 to Spring 2004.  Initially, we focused on bringing a host of value added features to statistical U.S. import and export data from U.S. Census.  Thereafter, we layered and connected this U.S. centric data with global import and export data (from U.N. Comtrade) on approximately 190 countries.  In time, we also incorporated U.S. state export data into the mix.  Surprisingly, no one had ever layered these data sets together before.

HTS (Harmonized Tariff System) Classification schema. The essential “language” of international trade statistics.

Atop these statistical data collections, we crafted a graphic, interactive interface wherein users, after selecting an “X” product vector (one of the 6,000 Harmonized System product classifications) and a “Y” location vector (one of 200 countries listed), were (almost) instantaneously presented with dozens of dynamically created reports representing many perspectives pertaining to the intersection of their choice.  Over a billion unique reports could potentially be generated by the system. These included historic analyses spanning 20 years, trends 1, 3, 5 years into the future, contextual reports for the respective region and industry, competitive analyses, product/industry segmentation and trending, etc.  One economics/international trade professor remarked that the system made “data dance”.

Over a billion dynamically generated reports could be produced by the CenTradeX system

Market testing and the resulting feedback compelled us into the task of finding and incorporating company data – both foreign and domestic – toward the objective of uncovering the actual traders behind the statistics.  Economics and trending with numbers is one thing… pinpointing buyers and sellers is quite another.  Consequently, we acquired and assimilated the best known company sources (at the time) including Kompass, Harris Info, Hoovers, D&B, PIERS and others.

One of the most challenging aspects was that statistical data is organized under one (HTS) classification schema while company information is organized under other (unrelated) systems (SIC, NAICS, or a vendor’s particular proprietary taxonomy). However, after successfully tackling that enigmatic brainteaser, we were able to incorporate other data sets, tariffs (for all countries and products), estimated shipping costs (from four U.S. port regions to any/all countries), foreign exchange as well as our clients’ proprietary data collections and others… with relative ease.

Connecting disparate data collections is challenging

By far the most arduous of our data transformation and enhancement endeavors was to understand, normalize and intelligently incorporate U.S. Customs data into this dynamic mix.  ALL OTHER purveyors of Customs data started the other way around.  Some (the best) have consequently connected some other data elements.  Most notable among the few is PIERS, who several years back contracted with D&B to “tag” their data collection of U.S. Importers & Exporters.  Datamyne is presumably undertaking a similar process now.  Panjiva has connected with reasonable success many other vendors’ data pertaining to foreign sourcing. To my knowledge, at the present, no others have made those necessary connections.  Zepol has begun offering statistical data, but not connected to Customs data. They remain in separate unrelated silos.  Suffice it to say, there are still significant vistas to explore and develop.

The 45-second video (slide show) below is an irreverent depiction of what many users have reported experiencing when trying to find solid “trade intelligence” amidst the seemingly endless sea of Customs data obscurity.

Why U.S. Customs Data is The “KING” Among All International Trade Data Sources

In the past we published a series of five articles, “U.S. Customs Data Primer”, Parts 1-5, about the particulars of understanding, processing and enhancing the daily transactional inbound shipping records published by DHS/Customs.  This article will expand upon the fourth article in that series, “U.S. Customs Data Primer Part 4: Enlightenment Through Graphics & Diagrams”, which provides a visual guide for the processes we at CenTradeX employed in transforming raw data into trade intelligence.

The 90-second video (slide show) below portrays the original Trade Intelligence vision and mission that fueled the innovative growth and development of CenTradeX.

Developing innovative and powerful trade intelligence applications involves attending to three major areas: Access, Integration and Delivery.

  • Access: Helping the target audience(s) understand, find and use the data and application.
  • Integration: Normalizing of base data and connecting it with other relevant data sources.
  • Delivery: Enhancing the speed, efficacy and beauty with which the combined data is organized and presented.

For the purposes of this series, we will only focus on the second aspect, Integration.  Furthermore, having provided a foundation of understanding through the above referenced (linked) article, we will proceed to explore the more technical (under the hood) facets involved.

I refer to U.S. Customs waterborne import manifest data as the “base” data because it is considered (by myself and many others) the most intrinsically valuable, if challenging, international trade data set available.  It’s daily. It’s transactional. The U.S. is considered the easiest market to access. It contains a wealth of detailed information about the global supply chain. It represents $1 trillion of trade a year.

U.S. Customs data is #1.  It’s THE KING of the international trade jungle.  However, a powerful Kingdom is more than just one regal personage.  It must include a capable entourage as well. Thus the need for complementary data sets.

The first, primary step in building a powerful trade intelligence “kingdom” is attending to the King. PIERS has an easy to understand graphic portraying the processes of normalizing the “base” Customs data layer.

PIERS graphic portraying the processes involved in Normalizing Customs data

In performing the seven steps highlighted above, we at CenTradeX developed and refined many sophisticated procedures.  Over time, and through much scrutiny and evolution, we constructed a reliable, interconnected system of transforming data into intelligence.

  • It involved an array of automated queries and stored procedures for importing new data on a regular basis.
  • It involved creating programs and “scripts” that would parse, tokenize and reference selected data elements, compare and contrast them with its expanding library and referential databases as well as “learn” better ways of matching and connecting.
  • It involved scouring the planet for the best, most reliable, accurate and timely ancillary databases to enhance and expand the KING and the Kingdom.

U.S. Customs Data Primer Part 5: Reference Guide Other Related WTD Articles

This week we went under the hood to look at the nature and application of Customs data that tracks U.S. Waterborne Import Shipments from Overseas Suppliers and Sellers.

There are a number of previous articles wherein I have referred to other shortcomings and challenges inherent in understanding and applying U.S. Customs data.  Please note the following:

We also published several dozen articles focusing on the current Trade Intelligence purveyors of Customs data.  The links provided below will pull up a handful of articles each – for a particular company, group of companies (in cases where they are “minor, second tier” providers) and summary evaluations. You can also find these articles, and others grouped by various categories, on the top navigation menu of this site.

U.S. Customs Data Primer Part 4: Enlightenment Through Graphics & Diagrams

So now that we have addressed a few of the issues relating to understanding the inherent limitations contained within the U.S. Customs data, let’s look at the processes we (at CenTradeX) employed in parsing, normalizing and enhancing this data.  Every Trade Intelligence provider has their own approach to processing the data, along with their own particular brand of “spice” they add as well as the tools utilized to search through and display the data. Notwithstanding, the best few have many things in common.

Therefore I believe the following explanation may be both enlightening and educational, whether or not it is precisely mimicked.  First of all, let’s take a look at the “big picture”.  As reflected in the illustration below, Customs data contains detailed records – bills of lading – of the particulars of each and every transaction between foreign shipper and U.S. receiver for waterborne freight.

The Big Picture of Customs Data and the trade related information packed therein. 

As I’ve mentioned, Customs data is distributed (as a “flat file”) on a daily basis as the BOLs for various arriving “vessels” are cleared at the respective U.S. ports.  The first, rather arduous process is to “normalize” data into usable, organized elements contained in a relational database.

Normalization process for daily AMS Customs Flat files

We found that the best, most efficient method to add accuracy and value to Customs data, after the initial normalization process had been completed, was to connect it with our other comprehensive company and referential databases.  After going through many elaborate transitions, this enhanced customs data was ready for “show time”.

Diagram illustrating the processes involved in normalizing and transforming Customs data

We found that the more relevant ancillary databases we were able to connect to the Customs data, the more dimensionalized and powerful the individual portraits of trade and the underlying traders became and the broader the business applications and potential.

Connecting the Dots. Ancillary data sources to Customs data.

Trade Intelligence begins with data. It is the fundamental building block from which dynamic business applications are crafted.  To make delicious, even digestible Trade Intelligence you must adhere to some basic steps.

  1. Get good, accurate, timely data.
  2. Scrub it up, remove the “dirt”.
  3. Mix it with other good data.
  4. Cook it well.
  5. Serve it with style.

Deriving Digestible Trade Intelligence from Customs Data takes finesse and diligence

At CenTradeX, we worked with many clients to develop innovative International Trade Solutions, some of which incorporated daily transactional Customs data toward the objective of assisting their staff and customers to find and take advantage of global business opportunities both here and abroad.  Several of our applications, namely Prospects (and its hybrid interface for the financial community – Trade Finance) and Stats Plus, were acquired and are now marketed by UBM Global Trade /PIERS.

Prospects, Trade Finance & Stats Plus; T.I. Applications developed by CenTradeX, acquired and now marketed by PIERS

Several other Trade Information providers have developed some very powerful and cool applications incorporating U.S. Customs data.  Check out the top navigational menu of this site under T.I. Providers Links> Transactional, Article Categories> Suppliers as well as Trade Blogs> T.I. Providers and Video Library> T.I. Providers for more information on these companies and the products and services they offer.

U.S. Customs Data Primer Part 3: The Devil (or a Worthwhile Treasure) is in the Details

Let’s go back to the intrinsic nature of the U.S. Customs Data itself.

U.S. Customs data is gathered electronically through the AMS (Automated Manifest System), for sea, air and rail. However, only waterborne manifests are available publicly.  Each daily tally contains detailed records of the tens of thousands of shipments that arrive at U.S. ports, many millions of shipments each year.  Since we are a country of consumers and most imports arrive via ship, U.S. Customs Waterborne Import data represents MOST of our trade activity… to the tune of $1 trillion annually.

In spite of its inherent shortcomings, pause to appreciate the fact that detailed records of virtually every waterborne shipment, every foreign seller, every corresponding U.S. buyer, every product and component, every carrier, every port, for every day is made available publicly.  The potential value contained therein is staggering.  Most countries (perhaps wisely) don’t publish such information.  In some countries releasing such information would be /is a capital offense.

First of all, U.S. Customs data comes as a “flat file”.  It is not conveniently delimited for easy assimilation.  For each and every Bill of Lading (BOL) the respective data fields have a reserved number of characters rigidly assigned; some fields are filled with interesting data, others remain completely empty.  Analogous to a train hauling rail cars of varying lengths, each BOL must have its fields carefully unloaded and organized.  One mishandled BOL field can wreak havoc with accurate assimilation and analysis of the data.
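To show what unloading the rail cars involves, here is a minimal fixed-width parsing sketch. The field names, positions and widths in `LAYOUT` are invented for the example; the real AMS layout is considerably longer and is not reproduced here.

```python
# Hypothetical layout: (field name, start position, width).
LAYOUT = [
    ("record_type", 0, 1),     # single-character flag
    ("bol_number", 1, 12),
    ("shipper_name", 13, 20),
    ("us_port", 33, 5),
]

RECORD_LENGTH = max(start + width for _, start, width in LAYOUT)


def parse_record(line):
    """Slice one fixed-width line into named fields, trimming the padding.

    Short lines are padded so that missing trailing fields come back empty
    instead of shifting or truncating their neighbors.
    """
    padded = line.rstrip("\n").ljust(RECORD_LENGTH)
    return {
        name: padded[start:start + width].strip()
        for name, start, width in LAYOUT
    }
```

Because every field is cut by position rather than by a delimiter, a single wrong offset in the layout shifts everything after it, which is why one mishandled field can corrupt an entire record.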

For instance, there is a very important single character field contained in the data string that signifies whether the subsequent data for the BOL is original and new or whether it represents a revision of an already processed BOL (from a previous day!).  I have seen cases where there are several dozen revisions published to a single record occurring over a span of several months!

If not accounted for, you have multiple (and inaccurate) shipment counts.  The difficulty is that the only way to adequately correct the problem is to go back into already processed and published data and completely erase and replace the previous record; otherwise the inaccuracy is retained.
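The erase-and-replace logic can be sketched as follows, assuming each parsed record carries a BOL number and a one-character flag. The flag values `"N"` (new) and `"R"` (revision), the field names and the in-memory `store` are assumptions for illustration only.

```python
def apply_daily_feed(store, records):
    """Apply one day's records to a store keyed by BOL number.

    A revision overwrites the earlier version of the same BOL instead of
    being counted as an additional shipment.
    """
    for rec in records:
        key = rec["bol_number"]
        previous = store.get(key)
        if rec["record_type"] == "R" and previous is not None:
            revisions = previous["revisions"] + 1
        else:
            revisions = 0
        store[key] = {**rec, "revisions": revisions}
    return store
```

With this approach the shipment count is simply the number of keys in the store, no matter how many revisions arrive over the following months.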

Another significant problem is that there are often multiple containers for one BOL AND/OR multiple BOLs for one container (LCL, less-than-container loads).  If not prudently accounted for, there will be huge discrepancies when calculating TEUs (the standard measure for shipment volumes).
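A small sketch shows where the discrepancy comes from. It counts each physical container once and, as one possible convention, splits a shared container's TEUs evenly among the BOLs that use it. The even split, the function names and the sample figures are assumptions; a weight-based or volume-based split would be equally defensible.

```python
from collections import defaultdict

# One TEU is a twenty-foot equivalent unit, so a 40 ft container counts as 2.
TEU_BY_LENGTH_FT = {20: 1.0, 40: 2.0}


def total_teus(links):
    """Total TEUs, counting each physical container once.

    links is a list of (bol_number, container_number, length_ft) tuples.
    """
    containers = {container: length for _, container, length in links}
    return sum(TEU_BY_LENGTH_FT[length] for length in containers.values())


def teus_per_bol(links):
    """Allocate each container's TEUs evenly across the BOLs sharing it."""
    bols_per_container = defaultdict(set)
    for bol, container, _ in links:
        bols_per_container[container].add(bol)
    shares = defaultdict(float)
    for bol, container, length in set(links):
        shares[bol] += TEU_BY_LENGTH_FT[length] / len(bols_per_container[container])
    return dict(shares)
```

In a feed where two BOLs share one 40 ft container, naively summing TEUs per row counts that container twice, while the functions above count it once and give each BOL a share of 1.0.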

Implications?  If you are evaluating whether or not to construct a new distribution center, expand port capacity, open up a freight forwarding office, evaluate economic development, perform competitive analyses on a particular U.S. buyer or foreign seller, or deconstruct and improve supply chain logistics, what you don’t know or what you think you know (but is really fictitious) can kill you.

U.S. Customs Data Primer Part 2: “Holes” in the Data & Other Frustrating Anomalies

Once you know where the holes are, many times you can fill some of them. U.S. Census (statistical) data – which is published on a monthly basis – can give you an accurate measure of the value, number of units (and thus by computation the average cost per unit), country of origin and U.S. port for a particular product grouping (arranged within the Harmonized Tariff System) and method of transport (air, water, and again by computation “other”, which would typically be rail/truck from Canada or Mexico).
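The two computations mentioned above are simple arithmetic, sketched here with hypothetical function names and made-up figures rather than actual Census values.

```python
def average_unit_value(total_value, units):
    """Average cost per unit; None when no units are reported."""
    return total_value / units if units else None


def other_transport_value(total_value, air_value, water_value):
    """Value moved by 'other' means (typically rail or truck), by subtraction."""
    return total_value - air_value - water_value
```

For instance, a product grouping reported at $1,000,000 for 50,000 units works out to $20 per unit, and a total of 1,000 with 250 by air and 600 by water leaves 150 for “other”.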

Unfortunately, U.S. Customs data and U.S. Census data are asynchronous in many important ways.  For reasons beyond the scope of this article, it is impossible to take a record of all waterborne shipments for the month of January from U.S. Customs and seamlessly overlay it with the aggregate statistical record of imports provided by U.S. Census.  Further, the HS product categorization system is many times either too specific or too broad to apply.

Another problem is that several thousand U.S. importers and their corresponding foreign suppliers have been “suppressed” from appearing in the U.S. Customs data publications through the “trade secrets” exclusion to the Freedom of Information Act.  This “suppression” results in about 1/7 of all shipment records having blank fields where the “foreign shipper” and “U.S. importer” identification would normally be.

Again, once you know where the holes are, there are ways to work around them.  Wal-Mart is an obvious entity that attempts to mask its supply chain activities and valuable suppliers.  Notwithstanding, in a landmark report done on the “tainted toy” fiasco several years ago, we were able to extract 40,000 imported shipments of toys by Wal-Mart (of the 400,000 we retrieved) over an 18-month period of time.

How? Several methods. Although presumably “suppressed”, tens of thousands of transactions slip through the filtering methods applied by U.S. Customs technologies. Port pairs (matching foreign port with U.S. port) for a particular product also yield significant dividends.  The “product description” and “marks and numbers” fields contained on the shipping manifest sometimes contain references to either Wal-Mart or one of its known suppliers.  Product identification information – SKUs, trademarks, etc.- are also sometimes found.
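One of these methods, scanning the free-text fields of suppressed records for telltale references, can be sketched like this. The record field names and the `find_candidates` function are assumptions for illustration, and the keyword list would in practice include the importer's name, known suppliers, trademarks and SKUs.

```python
import re


def find_candidates(records, keywords):
    """Return suppressed records whose free-text fields mention any keyword.

    A record is treated as suppressed when its 'importer' field is blank.
    """
    if not keywords:
        return []
    pattern = re.compile(
        "|".join(re.escape(keyword) for keyword in keywords), re.IGNORECASE
    )
    hits = []
    for rec in records:
        if rec.get("importer"):
            continue  # importer is identified, nothing to infer
        text = " ".join(
            [rec.get("product_description", ""), rec.get("marks_and_numbers", "")]
        )
        if pattern.search(text):
            hits.append(rec)
    return hits
```

Hits from a scan like this are only candidates; combining them with port pairs and product codes is what turns a hunch into a usable attribution.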

It’s all a matter of sleuthing: trying to put together a complex puzzle from the resources at hand.  In the end, it’s an imperfect world with incomplete data.  However, with some effort, technological tools, multiple data sources along with intelligence and knowledge, you can discover an amazing amount of very valuable trade /business intelligence.  You just need to increase your awareness in order to align your expectations to what’s real and possible.

U.S. Customs Data Primer Part 1: You Can’t Always Get What You Want… BUT

Most people seem to want what they don’t have.  I guess it’s human nature.  It’s that way with trade intelligence. Folks want to extract more information from it than what is intrinsically possible.  You just can’t get soda pop from milking a cow.

You aren’t going to get a complete picture of importers, supply chain, current inventory, shipment valuations and intrastate transport patterns from U.S. Customs data.  However, just because you can’t have everything, doesn’t mean you can’t get a lot.  As Mick Jagger sang, “You can’t always get what you want, but if you try sometimes, you just might find, you get what you need.”

To understand what you can and can’t get from U.S. Customs data… we need to dig into what it is and why it is… what it has and what it doesn’t have… what current T.I. providers are doing to enhance the base data…  where the holes are and how best to fill them.

U.S. Customs, now under the auspices of Homeland Security, requires detailed documentation of all waterborne shipments entering into the United States.  This information must be filed 24 hours before the shipment is loaded aboard the vessel at its originating foreign port.

Once the carriers dock at their respective domestic port, each day’s documentation of shipments (midnight cut off point) is published and distributed via FTP (it used to be sent overnight on a DVD) to awaiting subscribers (of which there are only a couple handfuls). This is made available through the Freedom of Information Act.

First of all, with the exception of UBM Global Trade /PIERS, which has special reciprocal information exchange deals with many ports and carriers as well as a cadre of data gatherers assigned to many U.S. ports, only daily transactional data on U.S. IMPORTS is available, NOT EXPORTS.  Thus, as a U.S. manufacturer, you aren’t going to find a list of foreign buyers for your particular product within the confines of U.S. Customs data.

Secondly, only commodities and products that enter the U.S. via SHIP (waterborne freight both containerized and non-containerized) are accounted for.  Shipments that come via TRUCK or RAIL, let’s say from our North American neighbors – Canada or Mexico – are invisible.  Also absent are shipments that come via AIR.

Therefore, if you’re looking for U.S. Customs data to provide information on shipments of high-tech components, you’re going to be very disappointed (because they are mostly shipped by air).  If you want competitive intelligence on a company who largely imports from suppliers in Mexico, again, you’re going to become very frustrated.  If you’re after an accurate analysis of all foreign suppliers and U.S. importers for a particular component that may have originated from several countries (including our NAFTA neighbors) and shipped by multiple means (air, rail, truck, ship), it just isn’t going to happen.  There are going to be huge, gaping holes in your report.

Available U.S. Customs data ONLY details inbound waterborne shipments, NOT U.S. exports and NOT trade activity by rail, truck or air.

Understanding Data: Normalization Procedures with U.S. Customs Data

PIERS has an excellent graphic that explains the process by which a reputable TI Provider should handle the U.S. Customs Waterborne Import Manifest (bill of lading) data. Many other TI providers, particularly the newcomers to the market, jump from collection to publishing. After all, if you eliminate cleansing, standardization, verification, validation and enhancing you’re bound to save time and money.  That’s why there are some data providers offering access to the U.S. Customs data for as little as $30.10 per month.

During my tenure as founder/CEO at CenTradeX, we worked very hard to make sense of data, connect it in innovative ways and provide easy access and graphic delivery.  We had a lot of smart and creative people working a long time toward those objectives. We spent over two years working out the bugs before the first interface integrating and utilizing the U.S. Customs data could be launched.  It’s complex and obtuse.  It’s also perhaps the most valuable single source of Trade Data available.

The inherent treasures buried within the data have only begun to be unearthed. PIERS has gotten the furthest, particularly with their acquisition of key CenTradeX applications and technologists, but even they have a long, long way to go.  It is my hope that, rather than succumbing to the recent and base marketplace inertia that has led to the commoditization and devaluation of the data, those with enough vision and resources (which may or may not be PIERS) will apply the necessary capital and creativity to the task of furthering innovation in the trade intelligence field.

The following illustrations address one of the necessary aspects in the normalization, integration and enhancement processes involved with U.S. Customs Data. The first two diagrams (which can be clicked upon to display full size) relate to the identification, normalization and enhancement of the Foreign Shipper and U.S. Importer of record.

The respective U.S. Customs data fields containing Shipper and Importer names and addresses need to be normalized (including standardization of the many iterations of those names) and broken down into separate “tokens” such as zip code, state, phone, city, etc. These tokens are matched against a refined and dependable company data repository derived from third-party sources (we used Hoovers, D&B, Kompass, PIERS and others) as well as the perfected names collected over time from the Manifests themselves. The number of tokens matched is then scored on an internally developed reliability or veracity scale.
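
As a rough illustration of the token matching just described, here is a minimal Python sketch. The field names, the weights and the 0 to 1 scale are all assumptions invented for this example; the actual CenTradeX scoring scale was internally developed and is not documented here.

```python
# Hypothetical weights: how much each matching token adds to the score.
TOKEN_WEIGHTS = {"name": 0.4, "zip": 0.2, "city": 0.15, "state": 0.1, "phone": 0.15}


def normalize(value):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in value.lower())
    return " ".join(cleaned.split())


def match_score(manifest_tokens, reference_tokens):
    """Return the summed weight of the tokens present in both records that agree."""
    score = 0.0
    for field, weight in TOKEN_WEIGHTS.items():
        a = manifest_tokens.get(field)
        b = reference_tokens.get(field)
        if a and b and normalize(a) == normalize(b):
            score += weight
    return round(score, 2)


# A manifest record compared against one entry in the reference repository.
manifest = {"name": "ABC, Inc.", "zip": "30301", "city": "Atlanta", "state": "GA"}
reference = {"name": "abc inc", "zip": "30301", "city": "ATLANTA", "state": "GA", "phone": "555-0100"}
print(match_score(manifest, reference))
```

A missing token (the phone number here) simply contributes nothing, so a sparse manifest record scores lower than a complete one even when everything present agrees.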

The same procedure can be utilized to normalize and match several “silos” of disparate data held between different companies or divisions of the same company such as marketing, operations and finance.  CenTradeX was once consulted by Maersk for a project in which they wanted to normalize, standardize and match their own internal company information, after which time they then could connect it to the individual shipment manifests for themselves or their competitors.

TI Transformation: Data into Information into Knowledge into Intelligence into Application

The commoditization (and devaluing) of trade data and trade data based products is accelerating, perhaps inversely proportional to the quality and number of suppliers and products in the marketplace.

Field of Wheat

Let’s look at one of the most treasured (and perhaps useful) types of trade information – U.S. Customs Waterborne Import Manifest Data.  Automated Manifest System data, sometimes called “AMS”, is information collected daily by DHS (Department of Homeland Security) U.S. Customs for and about each and every ship and shipment bound for the U.S. Thousands of imported shipments are logged every day. Each manifest contains information about the foreign shipper (exporter), the receiver (importer), details on the product shipped and various logistical specifics on routing.

Harvested Grain

“Back in the Day” when I founded CenTradeX in the summer of 2000, only PIERS offered such information. Primarily, it was distributed to customers via a stack of CDs each and every month. The user hunted through an Excel type interface for specifics on a particular product, shipper or importer. For elite customers, PIERS offered a plethora of prepared reports. (One of their customers once showed me a CD containing over 57,000 such reports). Slowly, PIERS converted such customers to an on-line system which also served to reduce rampant piracy.

Refined Flour

In the last 5 years, available technological resources have grown exponentially. Correspondingly, vendors offering access to and products based upon the U.S. Customs Waterborne Manifest Data have proliferated like bunnies. On the one hand, this has led to increased competitive pressures which have driven innovation forward, quality upward and prices downward. On the other hand, there is a widening gap between data and intelligence.

I stayed awake until 2:00 a.m. one evening recently, trying to catch up on all (that I could find) of the NEW vendors offering AMS data… WHEW! The data has gotten incredibly cheap. The quality of the companies /products is mixed. Some look like they are solo operations run out of someone’s garage. Others are incredibly slick.

Loaf of Bread

The first competitor (not including my company CenTradeX) on the scene was Zepol. Started by two young fellows out of Minnesota, they had a simple business plan: improve on PIERS’ search utility and undercut PIERS’ price by 20%. Slowly they made headway. They were followed by Datamyne, Import Genius and Panjiva, which appeared in the last several years. Now, add to the list Manifest Journals, Cybex, Info Drive India, IE Intelligence, Trade Intelligency, Data Trade, Trade Mining, Import Intel, TradeKey, Vanguard…

Sometimes quality doesn’t have a price tag.

At the peak of the pyramid in price and quality /value is PIERS (of course) – a handful of “G Notes” will buy you the best. Second tier providers include Datamyne, Zepol, Import Genius and Panjiva whose prices range from a few Benjamins to a couple of Clevelands (the President on the now defunct $1,000 bill). Cascading down the food chain are the bottom feeders, like Manifest Journals, Cybex, Info Drive India and the others, which offer access to AMS data for as low as $30.10 a month.

The current or prospective user of “trade intelligence” products and resources must decide upon which “values” he values most. It’s cheaper to harvest the wheat yourself… you can purchase the raw AMS data directly from U.S. Customs for $100 per day. Maybe even start your own company. Heck, the last three CEOs of PIERS did just that.  They either run or have founded companies in the list above.

*This post was originally published during the first week of May, 2011.  

U.S. Customs (AMS) Waterborne Shipping Manifest (BOL) Import Data

TI providers such as PIERS, Datamyne, Zepol, Import Genius, Panjiva (and a growing host of others) depend upon BOL data as the basis of their product offering. They all receive base shipping documents from U.S. Customs/DHS (Department of Homeland Security) via direct FTP feed or delivered on daily DVDs. Each follows their particular processes of collection, cleansing, standardization, verification, validation, enhancement and publication.

There are many names used to refer to this data. Among the identifiers I’ve heard (and used) alone or in combination are: AMS, BOL, Customs, Waterborne, Manifest, Shipping, Import, and others. Let’s clarify. Not less than 24 hours prior to an inbound shipment from a foreign port to the United States, a handful of documents must be filed with the U.S. Government (Customs and Border Protection), namely bill(s) of lading. By and large, these BOLs are filed electronically utilizing the AMS (Automated Manifest System).

For the purposes of this article we are limiting our discussion to U.S. waterborne (by sea) imports. There are also air and rail AMS as well as documents collected for trans-border shipments via truck to/from Canada or Mexico. Also excluded is transactional data collected on U.S. (waterborne) EXPORT shipments. At this juncture only PIERS, with an on-the-ground staff of reporters stationed at over 80 U.S. ports is able to collect and disseminate such information.

15 million Bills of Lading are Collected Annually Thru SEA AMS

Thus, we address the roughly 15,000,000 annual waterborne (cargo carried by ship across a deep blue sea) shipments (BOLs) filed and collected via the AMS (computer) system by our friendly Customs officials, gathered and disseminated by your neighborhood TI provider via their particular product /interface (or purchased directly from DHS/Customs at $100 per day).

It’s all the same base data. There are different refinement processes and “value added” flavors added to the stew. The data is served in a plethora of fashions. Some TI providers dish up the data (or offer it on a self-serve basis) on paper plates while others have a cadre of courteous, well trained, superbly tailored trade experts to serve you.

There are only two dozen fields of data on any particular bill of lading that can be made publicly available, through the Freedom of Information Act. No TI provider or entity, despite size or age, has the right to more or less data. The basic available data elements are:  (* signifies that the information is sometimes but not always listed):

  • Consignee: (Name, Address, *Phone, *Email) Essentially the U.S. importer or buyer.
  • Shipper: (Name, Address, *Phone, *Email) Essentially the foreign exporter or seller.
  • Notify Party: (Who gets to know when the shipment arrives. There can be multiple “notifies”.)
  • Product Description: (Sometimes extremely detailed including 10 digit HS identifiers, invoice #, product #, etc.)
  • Marks and Numbers: (Notations on the boxes or containers, trademarks, product identifiers, etc.)
  • Port Info: (Foreign, U.S. along with transfer points, plus a couple of other items.)
  • Shipment detail: (BOL #, TEUs, weight, quantity, measurement, container #, container type.)
  • Carrier detail: (Ship name, ship code, voyage #, Carrier /sub-carrier.)
  • Misc: (A couple of other minor fields I won’t bother to mention.)

Call it what you want. The data reflects daily trading activity for U.S. import shipments by sea… end of line… end of story.

Part Three: The ABCs of U.S. Customs Data- Issues & Shortcomings

Buyers beware. Users of U.S. Customs Waterborne Import Manifest (Bill of Lading) data need to be aware of the major shortcomings & pitfalls.  Part 3 of 3.

In addition to the plethora of potential iterations for each U.S. importer and corresponding foreign shipper identified on the shipping manifests, there are other significant problems.

There exists the Master versus House Bill of Lading (BOL) issue, which leads to many duplicate container counts.  The same shipment may appear under both filings.  Unless the TI provider has developed the technology to address this issue, accurate container counts for both shipper and importer will be impossible.

There can be dozens of duplicate bills of lading contained within the data

Further, there may be numerous revisions, sometimes dozens, made to a particular bill of lading. These revisions may be published days or months subsequent to the original filing.  Unless said roadside TI provider makes provision for ongoing corrections by going back and deleting all previous entries for a particular shipment whenever a new revision shows up, transactional profiles will be greatly skewed.
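
The correction step described above can be sketched in a few lines of Python. This is only an illustration: the field names `bol_number` and `published` are hypothetical, and it assumes every revision carries the same BOL number plus an ISO-formatted publication date, which the raw data may not guarantee.

```python
def latest_revisions(records):
    """Keep only the most recently published record for each BOL number."""
    latest = {}
    for rec in records:
        key = rec["bol_number"]
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if key not in latest or rec["published"] > latest[key]["published"]:
            latest[key] = rec
    return list(latest.values())


records = [
    {"bol_number": "A1", "published": "2011-03-01", "containers": 2},
    {"bol_number": "A1", "published": "2011-03-09", "containers": 3},  # later revision of A1
    {"bol_number": "B7", "published": "2011-03-02", "containers": 1},
]
deduplicated = latest_revisions(records)
print(len(deduplicated))
```

Without this step the two A1 filings would be counted as separate shipments with five containers between them, rather than one shipment with three.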

Many bills of lading contain multiple containers.  Some containers contain multiple shipments that have been aggregated together.  There are many types and sizes of containers. A 40 foot container is 2 TEUs (Twenty-foot Equivalent Units) and a 20 foot container is 1 TEU. A 45 foot container is 2.25 TEUs.  There are many other variations.  Therefore, the number of containers alone is not a dependable measurement. Neither is shipment count.  One shipment may contain 20 containers or may represent 1/5th of one container.
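
The TEU arithmetic above is simple enough to show directly. The sketch below is a Python illustration that assumes you already know each container’s length in feet and what fraction of it belongs to the shipment; the function name and input shape are made up for the example.

```python
def shipment_teus(containers):
    """Sum TEUs for a list of (length_in_feet, share_of_container) pairs.

    One TEU is one twenty-foot container, so TEUs = length / 20.
    """
    return sum((length_ft / 20) * share for length_ft, share in containers)


# One full 40 foot, one full 20 foot and one full 45 foot container.
full_load = shipment_teus([(40, 1.0), (20, 1.0), (45, 1.0)])

# A shipment that occupies one fifth of a single 40 foot container.
partial_load = shipment_teus([(40, 0.2)])

print(full_load, partial_load)
```

Measured this way, three containers come to 5.25 TEUs while the partial shipment comes to a fraction of one, which is why neither container count nor shipment count is a dependable yardstick on its own.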

Although it is against the law, and more stringent measures have been employed since 9/11, many times the real shipper and importer of record do not appear on the BOL. Instead, the Freight Forwarder, NVOCC or some trade agency (middleman) may be listed.

A little sleuthing can reveal many secrets

Several thousand U.S. importers have petitioned CBP to have their identities suppressed on the publicly distributed BOLs under the trade secret provisions of the Freedom of Information Act (FOIA).  Around 14% of all BOLs (millions of shipments a year) are thus suppressed.  It has been called “The Walmart Effect”.

In addition, some U.S. Importers and Foreign Suppliers seek to hide their identities by providing the required identity information but listing it within the product or trademarks area of the BOL instead of the name fields.

Despite legitimate tactics of name suppression or other more dubious methods employed to conceal the details involved in one’s trade activity, with a little clever sleuthing much can be revealed.

For the tainted toy study we conducted several years ago, we were able to identify over 40,000 toy related shipments over an 18 month period by Wal-Mart alone, despite their obvious efforts to mask their import activity.

Trade Intelligence is a lot more than data and a search/reporting tool.

Part Two: The ABCs of U.S. Customs Data- Issues & Shortcomings

Buyers beware. Users of U.S. Customs Waterborne Import Manifest (Bill of Lading) data need to be aware of the major shortcomings & pitfalls.  Part 2 of 3.

Although shipping manifests contain valuable information about the trade transactions of the U.S. Importer and Foreign Supplier, they can’t and don’t provide a complete picture.  One of the most predominant shortcomings is that they only document U.S. Waterborne Imports: products and commodities transported by ship, not air, not rail, not truck, not camel.

Therefore Canadian and Mexican cross border trade is all but invisible. Export transactions are not listed.  (Re-exports in some cases are).  Air freight shipments are not available.  And although a majority (around two-thirds) of the products we, in America, import from overseas comes to us by ship, an important minority doesn’t.  In particular, high value, just-in-time, perishable and fragile components or merchandise are often not transported by water.

U.S. Customs BOL Data ONLY looks at Waterborne Imports

So, U.S. Customs Waterborne (BOL) import data is not a great place to look for foreign suppliers of such things.

Furthermore, specific product identification is not easily uncovered within the data.  Yes, sometimes the respective Harmonized Tariff codes will be buried within the product or trademark fields on the BOL, but they can only be recovered if your particular trade provider has developed the algorithms to accurately parse an HS code from among other numerical data such as invoice numbers, quantities, phone numbers, addresses, reference numbers, etc.
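
To give a feel for the parsing problem, here is a deliberately conservative Python sketch. It assumes the filer labelled the code with “HS”, “HTS” or “H.S.” (optionally followed by “CODE”), and it ignores unlabelled numbers entirely so that invoice and phone numbers are not mistaken for tariff codes. Real manifests are far messier, and this is not any provider’s actual algorithm.

```python
import re

# Assumption: codes are labelled, e.g. "HS CODE: 9503.00.0090" or "HTS 950300".
# Captures 6 to 10 digits with optional dots, and refuses longer digit runs.
HS_PATTERN = re.compile(
    r"\bH\.?T?\.?S\.?\s*(?:CODE)?\s*(?:NO\.?)?\s*[:#]?\s*"
    r"(\d{4}\.?\d{2}(?:\.?\d{2,4})?)(?!\d)",
    re.IGNORECASE,
)


def extract_hs_codes(description):
    """Return the labelled HS codes found in a description, as digit strings."""
    return [re.sub(r"\D", "", match) for match in HS_PATTERN.findall(description)]


sample = "PLASTIC TOYS HS CODE: 9503.00.0090 INV 20071234 TEL 5551234567"
print(extract_hs_codes(sample))
```

The trade-off is recall: any code that appears without a label is missed, which is one reason a blunt keyword search over product descriptions falls short.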

Many times there are no specifics.  Toys, Furniture or Glassware may be the full extent of the product description. Other times there may be extensive descriptions including 10 digit HS codes, trademarks and even SKU numbers.  There is no uniformity.  Your typical TI provider does little to help in this regard. Most simply offer a blunt search tool which plods through the millions of product descriptions contained on individual BOLs.

In order to put an estimated price tag on a shipment, you must first know the specifics of the product, down to at least a 6, preferably 10-digit Harmonized code. Then, using statistical information you can roughly infer an estimated value which is a very crude measuring stick.

Accurate Product Identification is Vital to Making Use of the Customs Data

PIERS, which has been at all this the longest, is the only company I know of that has even attempted to attribute an estimated shipment value.  To do so, they first had to assign a specific product identifier (much of this is still done by hand) to each BOL and then attach a gross average based upon aggregated statistical data from U.S. Census.

Many times even the resulting calculations have been flawed.  However, as we say in the data world: “Bad breath is better than no breath at all”.

Note: For those who want to give shipment valuation a spin, greater accuracy can be achieved by disaggregating the statistical (overlying) data by its respective (foreign & U.S.) port and foreign (source) country.
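
For the curious, the valuation approach and the disaggregation note above might be sketched as follows in Python. Every number in the table is a made-up placeholder, not a Census figure, and the lookup keys (6-digit HS code, source country, U.S. port) are simply one plausible way to organize the averages.

```python
# Made-up average values in USD per kilogram, for illustration only.
# Keys are (hs6, source_country, us_port); None means "all".
AVG_USD_PER_KG = {
    ("950300", "CN", "Los Angeles"): 6.10,
    ("950300", "CN", None): 5.80,
    ("950300", None, None): 6.50,
}


def estimate_value(hs_code, weight_kg, country=None, port=None):
    """Estimate shipment value, preferring the most specific average available."""
    hs6 = hs_code[:6]
    for key in ((hs6, country, port), (hs6, country, None), (hs6, None, None)):
        rate = AVG_USD_PER_KG.get(key)
        if rate is not None:
            return round(rate * weight_kg, 2)
    return None  # no statistical average on file for this product


print(estimate_value("9503000090", 1000, country="CN", port="Los Angeles"))
print(estimate_value("9503000090", 1000, country="CN", port="Savannah"))
```

The fallback order is the point: a port-and-country average is used when one exists, then a country average, then the gross average, which remains a very crude measuring stick.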

Part One: The ABCs of U.S. Customs Data- Issues & Shortcomings

Buyers Beware! Roadside TI Vendors May be Selling you a Pig in a Poke

Users of trade intelligence, in particular U.S. Customs Waterborne Import Manifest (bill of lading) data, need to be aware of the major shortcomings and pitfalls.  It’s important to learn the ABCs of the data.

With the recent proliferation of TI Providers offering access to Customs data via off-the-shelf BI software packages – some with subscription plans costing less than 99 cents a day – the veracity of the resulting reports needs to be seriously considered.  If a company saves a few thousand dollars by buying cheap data from a roadside TI vendor, and thereafter bases million-dollar global trade decisions upon errant reports, what is profited?

ABC company, which one does a BOL refer to?

Yes, the base data comes from the same source: DHS/CBP (Department of Homeland Security /Customs and Border Protection).  However, a veteran TI Provider, namely UBM Global Trade /PIERS, being the most reputable in the field, has invested significant resources over decades in various refinement and value added processes that help ensure quality, dependability and usability.  Whether these value added enhancements justify the pricing differentials involved is a matter for the market to ultimately decide.

What are some of the pitfalls?  Let’s take a simple hypothetical example: How many containers did ABC, inc. (American Business Corporation) import last from foreign supplier DCF, ltd. (Decent Chinese Factory)?  The answer is theoretically contained within the U.S. Customs data.

ABC. Broadcasting or Carpet care?

The first, most basic problem is that the names are not standardized on the bills of lading.  Consequently, there are literally dozens upon dozens of iterations for each importer name:

  • Amer. Bus. Corp.
  • American Business Corporation, Inc.
  • American Business Corp.
  • American Business Corporation, Incorporated,
  • ABC, Inc.

the ABCS of TI

Multiplied by the variances in naming conventions appearing for the corresponding trade partner:

  • Decent Chinese Factory, ltd.
  • DCF, Inc.
  • DCF, limited
  • Decent Chinese Factory
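
A first pass at collapsing name variants like those listed above can be sketched in Python. The abbreviation and suffix tables below are tiny, hypothetical stand-ins; a working system would need far larger ones plus the reference repository discussed elsewhere on this page.

```python
import re

# Hypothetical lookup tables, kept tiny for the example.
ABBREVIATIONS = {"amer": "american", "bus": "business", "corp": "corporation"}
LEGAL_SUFFIXES = {"inc", "incorporated", "ltd", "limited", "llc", "co"}


def normalize_company_name(raw):
    """Lowercase, strip punctuation, expand abbreviations, drop trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    words = [ABBREVIATIONS.get(word, word) for word in words]
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


variants = [
    "Amer. Bus. Corp.",
    "American Business Corporation, Inc.",
    "American Business Corp.",
    "American Business Corporation, Incorporated,",
]
print({normalize_company_name(name) for name in variants})
```

Note what this does not solve: “ABC, Inc.” normalizes to “abc” and “DCF, limited” to “dcf”, so tying an acronym back to the full name still requires a lookup against known companies.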

Depending upon the parsing and refining algorithms employed by the respective TI Provider (if any), there are also matters of matching location.  There may be 10 or more “ABC” corporations in the U.S.

  • American Business Corporation
  • Advanced Banana Clockworks
  • Aerospace Ballistics Controls
  • Atlanta Baseball Company

The nameless importer

Then there are matters of several divisions or locations with divergent business operations under the same conglomerate name.  Sometimes an NVOCC may be listed as the importer of record, which is a violation of law that occurs thousands of times a day. Some 14% of importer names are suppressed and thereby appear blank on the BOLs.  The list goes on.  Yet, naming conventions are one of the easiest problems to address!

Trade Intelligence or TI: IT all depends upon how you define “IT” and “TI”.

Many companies label themselves as trade INTELLIGENCE providers in some fashion or form.  Each and every organization, foreign or domestic, that I list below utilizes the DHS/U.S. Customs Waterborne Import Manifest (bill of lading) data as THE primary basis for their trade intelligence.

First, there is the 500-pound gorilla and industry leader, PIERS: “The STANDARD in Trade Intelligence” with products such as PIERS TI and a handful of others. They’re the standard. The (big) measuring stick.

Occupying the next rung on the food chain, you’ve got:

  • Datamyne: “The best VALUE in Business Intelligence”
  • Zepol: “Global intelligence that MOVES your BUSINESS” – with products like Trade IQ
  • Import Genius: “A LEADING provider of intelligence”
  • Panjiva: “THE leading intelligence PLATFORM”

These suppliers boast of value, motion, leadership and having a firm footing when it comes to their brand of trade intelligence.

Today’s bottom feeder, tomorrow’s industry leader

Lastly, let’s look at the folks I label “bottom feeders” (which upon reflection I should rename in a more complimentary or at least neutral fashion because, who knows, one of them could be my next business partner) who have their particular shtick.

  • ImportIntel: “CUSTOM intelligence reports”
  • Trade Data Channel: “US IMPORTS Trade Intelligence”
  • Manifest Journal: Call Michael to discuss your “TRADE Intelligence IQ”
  • Cybex: “ONE STOP SHOP for research and business intelligence”
  • InfoDrive India: “Export Import Business Intelligence Information in the most USER FRIENDLY & cost-effective (CHEAP) manner”
  • Trade Intelligency: “A TOP provider of Trade Intelligence”
  • IE Intelligence: “World INTEGRATED import export intelligence solutions”
  • IBIS Trade Intelligence: “METALS, CHEMICAL, PLASTIC (and other industry) trade intelligence”
  • Trade Mining: “QUICK Business Intelligence for Trade”
  • OTHER “bottom feeders” that somehow slipped out of my (inter) NET (search).

Yup, you can drop in on one of the “top”, “user-friendly”, “one-stop-(TI)shops”  for some “quick”, “cheap”, “US import” trade intelligence either for a specific industry like metals, chemicals or plastics or get it “integrated” (all together) or “custom made” (sliced and diced how you like it).  Or, you can just call Michael at Manifest Journal to discuss your Trade IQ personally.  Maybe he’ll give you a trade IQ test.  You may even qualify for Trade Mensa.

So, back to Trade Intelligence, what is it?

Will the proliferation of Trade Intelligence Providers add to overall customer happiness and successful business application?

On one level it is simply DATA with a little tech magic:

  • Presumably sifted like grain (to remove the rocks, sticks and foreign objects).  Everyone gets said “grain” from the same farmer, CBP.
  • Prudently classified, sorted and deposited into neat little silos (database objects called tables) hopefully guided by geeks with know how.
  • Craftily shaked, baked and served up using whatever “off-the-shelf” or custom-made business intelligence software and graphics reporting solution that is “tech du jour”.

Since the same data is publicly obtainable and relatively cheap, and with sophisticated BI software packages within easy reach, definitions and expectations of what constitutes Trade Intelligence are likely to change…rapidly.

The Use and Application of Trade Intelligence Can Be a Matter of Life and Death

In 2007, in association with ECRM, Walmart and the Arkansas World Trade Center, CenTradeX undertook a massive study into the issue of tainted toy shipments imported into the United States from China. Widespread reports of acute sickness and death among children were causing panic.

Thorough analyses of the media coverage about the event revealed very little of substance.  A handful of major U.S. merchandisers were mentioned, mainly Mattel, as well as a couple of Chinese factories. That was it, period.

The issue of securing and maintaining product quality and safety is of vital significance whether we’re addressing cars, or toys or food.  It takes on deepening levels of complexity and importance when manufacturers and retailers rely on foreign sources. The underlying problems have to do with transparency, dependability and accountability.

U.S. retailers import trillions of dollars of merchandise from thousands of overseas factories. A vast majority of these foreign made products are carried to us on cargo ships, offloaded at coastal ports, processed through U.S. Customs, and distributed through our local merchants.

Who in this supply chain is responsible if something bad happens such as young children dying from ingesting paint off toys laden with toxic levels of lead? During the tainted toy fiasco initially China was blamed.  Then the CEO of Mattel stepped forward to take the fall. Media ultimately pointed to lax governmental regulations. Purportedly, several Chinese factory executives were summarily executed.  Who’s responsible?

Several years prior, an agricultural infestation spread from a Florida port causing billions of dollars of damage before it could be contained.  It took months to isolate the source and effect a remedy. The culprit ended up being diseased wooden pallets carrying imported consumables shipped by our Latin American neighbors. An early warning system may have mitigated the extent of the disaster.

By analyzing the daily Waterborne Import Shipment Data from U.S. Customs, China transactional import-export data, as well as other statistical and company databases, we (CenTradeX) analyzed over 400,000 toy shipments from China into the U.S. by 4,000 retailers spanning an 18-month period.

Through the data we discovered that the same Chinese factories that were peddling tainted toys were also exporting other merchandise such as personal care items, household supplies and furniture.  Did these products contain dangerous levels of lead paint?  Did said factories continue to sell toys and other merchandise to their less regulated Asian neighbors?

Many of the answers can be found in publicly available data if one bothers to look.  Definitely some sleuthing is required.  Notwithstanding, look at the stakes involved.

At some point it may not just be a matter of tainted toys, radiated food or diseased pallets. Perhaps it will be a misclassified container of toxic chemicals or dangerous substances that makes its way past CBP and ends up causing a catastrophe in one of our major metropolitan areas.

Whether for matters of homeland security, food safety or quality assurance, much more can be done with trade intelligence to help us secure our health and our homeland.

U.S. Customs Waterborne Import Data: Perspective is Everything

Yes, creativity and technology are slowly being applied to the world of world trade data.  Back in the day, the only way you could fetch a good look at shipment manifests was by poring through a stack of CDs each month and viewing row after row of roughly hewn data. Heck, that was only 10 years ago.  CenTradeX was the first to merge, integrate, marry and correlate disparate data systems.  In addition, it was the pioneer in developing easy to use graphic interfaces and powerful visual reports.

When I began in the Trade Intelligence field in the year 2000, it was incredible to me that NO one had ever layered global statistics with U.S. statistics.  It seemed like a no brainer to me.  We kept adding layer upon layer (of statistical, referential and company data) in order to develop a more 3-D versus flat perspective of international trade.  Another remarkable thing to me was how misunderstood and undervalued trade data was and to a large extent still is.

Trade Intelligence is at the helm of the wheel that navigates $12 trillion worth of imports and exports annually.  It would stand to reason that more people would respect and want to unleash the full power and potential contained within the kernels of trade data. Many new providers are coming on the scene touting cheaper and cheaper access to the U.S. Customs Waterborne Import Data (see last week’s article “TI Transformation; Data into Information into Knowledge into Intelligence into Application“).

Most offer very rudimentary search utilities to access and manipulate the same data.  This proliferation and commoditization of the U.S. Customs data only reinforces the common marketplace perception of its uselessness and worthlessness. Yet, when used creatively and skillfully in combination with other data sources, it can make a huge impact on the bottom line.  Overall, competition is good.  Many times it forces positive innovation and change.  However, it can also kill it.

Many years ago, during the reign of the “old guard” at PIERS, we (CenTradeX) met with the executive team in order to discuss joint venture opportunities.  After thorough review of our CenTradeX applications, which dynamically integrated many Global, U.S. and State Statistical sources as well as Global, U.S. and China company information databases, it was remarked, “All that would be a good SUB-SET to our U.S. Waterborne Import Data”.  Suffice to say that a deal was not forthcoming.

The “new guard” at PIERS is hemispheres away from that way of thinking.  John Day, CEO UBM Global Trade and Gavin Carter, CIO UBM and at the helm of PIERS, are sharp, forward thinking, global minded trade intelligence professionals.  Since the recent installation of their new management team, complete overhaul of their legacy IT systems, and acquisition of innovative trade intelligence applications from CenTradeX in 2010, they are a force to be reckoned with.

Notwithstanding, they and other reputable Trade Intelligence providers are under pressure from the bottom-fishers who are only interested in turning data into quick dollars at the expense of long term viability.  Such forces will only serve to stagnate growth and development, IMHO.