CorpWatch API FAQ
Where does this information come from?
Every corporation listed on a U.S. stock exchange is required to file annual reports with the SEC. These reports—called “10-K forms”—are intended to help investors and regulators better understand the risks, assets, and current leadership of a company in order to make informed investment decisions. One of the requirements of a 10-K filing is that it must include a section that lists any other companies that are owned by the corporation making the filing. This “Exhibit 21” provides a rough snapshot of the ownership hierarchy of the subsidiary “child” companies along with the location or jurisdiction of incorporation of each subsidiary. Thanks to the efforts of various government transparency advocates, the SEC makes all of this filing information available on the web for free.
What does the CorpWatch API do that the SEC doesn’t?
Although the SEC provides a search interface for locating company filings (EDGAR / IDEA), the subsidiary information is not presented in a standardized format suitable for automated use or insertion into a database. The CorpWatch API uses parsers to “scrape” the subsidiary relationship information from Exhibit 21 of the 10-K filings and provides a well-structured interface for programs to query and process the subsidiary data.
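To give a feel for what “scraping” an Exhibit 21 involves, here is a minimal sketch in Python. It is not the CorpWatch parser. It assumes one common layout, in which each line holds a subsidiary name and a location separated by a tab, a run of spaces, or a dot leader. The function name `parse_exhibit_line` and the sample lines are made up for illustration.

```python
import re

# Hypothetical separator pattern: a dot leader ("...."), one or more tabs,
# or two or more whitespace characters between the name and the location.
SEPARATOR = re.compile(r"\s*(?:\.{3,}|\t+|\s{2,})\s*")


def parse_exhibit_line(line):
    """Split one Exhibit 21 line into a name and a location, or return None."""
    parts = [p for p in SEPARATOR.split(line.strip()) if p]
    if len(parts) != 2:
        # Headings, blank lines, and unusual layouts are left unparsed.
        return None
    return {"name": parts[0], "location": parts[1]}


sample = [
    "Subsidiaries of the Registrant",
    "Acme Widgets LLC ........ Delaware",
    "Acme Europe GmbH\tGermany",
]
records = [r for r in map(parse_exhibit_line, sample) if r is not None]
```

Real filings vary far more than this (tables, footnotes, multi-line names), which is why no simple rule like this one can handle every filing.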
What data do you provide, exactly?
Using the API you can access:
- Hundreds of thousands of formal names of public corporations
- A consistent ID system for companies
- Locations associated with companies (UN country / state or province)
- Subsidiary / parent relationships between companies
- Alternate and former names for companies
- A link to the SEC filing
In addition, for most filing companies we include
- the IRS tax ID number
- business and/or mailing addresses
- an industry classification (SIC) code
- the SEC company ID (CIK)
All for the low cost of … FREE!
What is an "API" anyway? How do I use it?
"API" stands for Application Programming Interface. An API is basically a way for software applications to communicate with each other and share data. To use the API directly, you need to be able to write simple software that can talk to the CorpWatch API service. For more information about how the API works, please consult the API Method Documentation. If you don't know how to write software, you can still access most of the data using the CrocTail site.
How do you know where a company is located?
When companies file with the SEC they must provide a business address and jurisdiction of incorporation (the state or country where they are legally registered). Exhibit 21 of a parent company's 10-K filing gives the name of the jurisdiction for each subsidiary company, but in some cases it also gives the location where the company actually does business. (For example, a company could be incorporated in Delaware but actually operate in California.) Unfortunately, the parser is not able to distinguish between the two types of locations, so the associated location could be either one. We also standardize the various ways of spelling or indicating nationalities and translate them to a standard UN code. (For example, “a French company” = “France” = “FR”.) We geocode only to the country and state/province level, so the map markers are not shown at the street addresses of the companies.
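The nationality standardization described above can be sketched roughly as follows. This is an illustration only: the lookup table `COUNTRY_ALIASES` and the function `normalize_location` are hypothetical names, the table holds just a few entries, and the real parser's rules are not shown in this FAQ.

```python
import re

# Hypothetical alias table mapping country names and nationality
# adjectives to two-letter country codes. A real table would be much longer.
COUNTRY_ALIASES = {
    "france": "FR",
    "french": "FR",
    "germany": "DE",
    "german": "DE",
    "united kingdom": "GB",
    "british": "GB",
}


def normalize_location(text):
    """Return a two-letter country code for free text, or None if unknown."""
    # Lowercase and replace punctuation with spaces before matching.
    cleaned = re.sub(r"[^a-z ]", " ", text.lower())
    # Check longer aliases first so multi-word names are matched whole.
    for alias in sorted(COUNTRY_ALIASES, key=len, reverse=True):
        if re.search(r"\b" + re.escape(alias) + r"\b", cleaned):
            return COUNTRY_ALIASES[alias]
    return None


code = normalize_location("a French company")
```

With this table, “a French company” and “France” both resolve to the same code, and unrecognized text returns None rather than a guess.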
How timely are the data?
Currently the application is using information from 2008 filings, since that is the most recent complete year. Corporations are only required to file once per year, and they don't all file on the same dates. When the CorpWatch API is completed, CrocTail will show the most recent information available from the SEC for each company.
How accurate are the data?
The Exhibit 21 filings are not submitted in a standardized format. This makes parsing them very difficult and in some cases impossible. Since we are dealing with many millions of records, it is not possible to check that each one is correct. There are definitely some errors, but we’ve checked some small samples and we believe we are parsing 90% of the subsidiary companies correctly. Thus, you should always check back with the original official filings at the SEC before making any decisions using this data. (Each profile includes links to the original forms.) Of course, companies can make their own errors or omissions in their filings as well.
How do you know you have all the subsidiaries, joint ventures, partnerships, etc. for each company?
We don’t. We only have the information that the companies themselves submitted to the SEC in their 10-K filings. There are a number of loopholes in the filing requirements, and companies often do a good job of hiding the ownership relations to prevent competitors or regulators from knowing what they are doing. Also, in some situations the legal and financial arrangements can be so complex that it is difficult to say whether or not one company “owns” another.
Why do some large companies only have one level of subsidiaries?
Some companies specify in their filings which of their subsidiaries own each other, but not all do. Furthermore, in some cases we are able to parse out the full list of “child” companies but are unable to correctly extract the hierarchy from the filing. Unfortunately, as the information is currently filed, there is no easy way to distinguish more complicated structures from simpler hierarchies where, for example, a single parent company owns a bunch of subsidiaries directly. It is our hope that the SEC will consider amending its guidance for filers in the future so that it is easier to extract and replicate the hierarchy from the filing.
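To make the difference between a flat list and a deeper hierarchy concrete, here is a small sketch in Python. It assumes the relationships are available as (parent, child) name pairs, which is an assumption about the data shape rather than a description of the API's output. The company names and helper functions are invented for the example.

```python
from collections import defaultdict


def build_children(relations):
    """Map each parent name to the list of its direct children."""
    children = defaultdict(list)
    for parent, child in relations:
        children[parent].append(child)
    return children


def top_level(relations):
    """Companies that appear as a parent but never as a child."""
    parents = {p for p, _ in relations}
    kids = {c for _, c in relations}
    return sorted(parents - kids)


def levels_below(company, children, _path=frozenset()):
    """Count the levels of children under a company (0 if it has none)."""
    if company in _path:  # guard against accidental cycles in the data
        return 0
    kids = children.get(company, [])
    if not kids:
        return 0
    path = _path | {company}
    return 1 + max(levels_below(k, children, path) for k in kids)


# Hypothetical data: one nested hierarchy and one flat one.
nested = [("Parent Co", "Sub A"), ("Parent Co", "Sub B"), ("Sub A", "Sub A1")]
flat = [("Flat Co", "Sub X"), ("Flat Co", "Sub Y")]

nested_depth = levels_below("Parent Co", build_children(nested))
flat_depth = levels_below("Flat Co", build_children(flat))
```

When a filing lists subsidiaries without saying which owns which, every relationship ends up looking like the flat case, even if the real structure is nested.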
Why do some corporations have multiple names?
There are a number of reasons this can happen. In the simplest case, a corporation may have many similarly named companies as its subsidiaries. Some larger firms may be made up of several separate “trees” of corporations that are, from a legal standpoint, independent corporations even though they may share a board of directors or other means of control. In many cases it is possible to determine how these complex corporations are related by reading the full 10-K, but the information is not present in the Exhibit 21 we are parsing.
Why can't I find Company X ?
This database mostly covers publicly traded corporations in the U.S. Not very many foreign companies appear in the SEC data, but the foreign subsidiaries of U.S. corporations are included. In addition, some of the largest companies in the U.S. are privately held. This means they don't usually file with the SEC and are not required to make any information available to the public.
OK, really, can you give me a more detailed breakdown of the scope of the database?
Basic stats for the June 2009 version of the API:
- Number of 2008 companies with some kind of filing (meta info): 72,171
- Number of company 10-K filings w/ sec21 (subsidiary info): 4,702
- Filings with relationships parsed: 4,281
- Percentage of filings parsed: 91
- Filings with empty sec21 (nothing to parse): 31
- Percentage of true filings parsed: 92
- Number of subsidiary relations in raw filings: 168,344
- Percentage of relationships w/ location: 100
- Total number of companies (after cleaning): 214,933
- Number of company relationships (after cleaning): 157,655
- Companies w/ zero relationships (no children and no parents): 64,536
- Top-level companies (no parents): 68,479
- Top-level companies with children: 3,943
- Number of companies with more than one level of children: 509
- Top 5 locations associated with companies, number of associations:
  - Delaware: ~45,400 (17%)
  - New York: ~20,700
  - California: ~17,700
  - Texas: ~10,500
  - Illinois: ~9,500
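Several of the figures above can be cross-checked with simple arithmetic. The sketch below assumes that “true filings” means filings whose sec21 was not empty, and that the top-level total is the sum of companies with zero relationships and top-level companies with children. Both readings are inferred from the numbers and are not stated definitions.

```python
# Figures copied from the stats list above.
filings_with_sec21 = 4702
filings_parsed = 4281
empty_sec21 = 31
zero_relationship_companies = 64536
top_level_with_children = 3943
top_level_total = 68479

# Share of all filings with an sec21 that were parsed.
pct_parsed = round(100 * filings_parsed / filings_with_sec21)

# Share parsed once filings with an empty sec21 are excluded (assumption).
pct_true_parsed = round(
    100 * filings_parsed / (filings_with_sec21 - empty_sec21)
)

# Top-level total as the sum of its two parts (assumption).
top_level_sum = zero_relationship_companies + top_level_with_children

print(pct_parsed, pct_true_parsed, top_level_sum == top_level_total)
```

Under those assumptions the computed values agree with the published 91 and 92 percent figures and with the top-level total.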
What are the main problems with this version of the API?
- Still not distinguishing "doing business as" relations well, although the parser is somewhat cleaner than before
- Locations are a mix of addresses and jurisdictions of incorporation
- Need a system for entering hierarchy info not coded in section 21 (there are three "top level" companies for Coca-Cola, but two are controlled by the third)
Can you give me a better idea how the parsing process actually works?
Sure. This text file gives an outline of the parsing sequence and scripts involved. And you can even download the source code here if you want to inspect all the messy details.
How can I find out more about this project?
Email us at: firstname.lastname@example.org
Or contact CorpWatch at: 2958 24th Street | San Francisco, CA | 94110, USA | Tel: +1-415-641-1633