Crowd Sourcing

Solicitation Number: 20120047
Agency: Library of Congress
Office: Contracts Services
Location: Contracts Services
Special Notice
Added: Apr 30, 2012 11:26 am
This is a Request for Information (RFI). The U.S. Copyright Office, a department of the Library of Congress, has initiated a project to digitize and make available online the historical records of copyrights dating from 1870 to 1977. The Library expects to issue a future request for proposals to select one or more organizations with the skills, experience, and equipment to support the capture of information through crowd sourcing. The purpose of this RFI is to determine the scope and extent of services available in the marketplace to accomplish the crowd sourcing effort.


In addition to its legal, regulatory and policy responsibilities, the Copyright Office is an office of public record for copyright registrations and other documents that pertain to copyright. The records are a historically important snapshot of the culture of the United States, primarily relating to copyrightable expression, authorship, and copyright ownership. They refer to works as diverse as books, photographs, musical compositions, sound recordings, motion pictures, software, and more, including works fixed in electronic formats. The pre-1978 records in the Copyright Office reflect approximately 16.4 million original and renewal registrations dating back to 1870, and approximately 350,000 assignments, transfers and terminations of copyright ownership involving 1.7 million titles. These records are open to the public and are used for many different purposes, including copyright-related commerce, such as licensing, and for historical documentation of authors. Records from 1978 to the present are already available online via the Copyright Office website.

There are about 70 million cards and pages among the records from which data will need to be captured to enable online searching and display of the information. The age, condition and formatting of the records limit the use of OCR, and therefore much of the data will need to be captured through keyboarding. Some of the data capture tasks will require analysis of the data, such as extracting names and titles from the 49 million catalog cards. Some tasks will be semi-analytical, such as noting whether a card header is a name or a title. Other tasks will be more specific, such as capturing formatted numbers from a set location on similarly formatted cards or pages.

Particular Records in the Office's Possession

There are five primary sets from which records might be selected for crowd source data capture. Digitization of the individual cards and pages has begun, producing uncompressed TIFF images in 24-bit color at 300 pixels per inch (ppi). For crowd source data capture, derivative image files would be made available, most likely as color JPEG files at 300 ppi. The source records include:

1. Copyright Record Books containing the early records of copyright ownership
2. Application forms in bound volumes also referred to as Copyright Record Books
3. Historical Copyright Card Catalog
4. Published Catalogs of Copyright Entries (CCEs)
5. Copies of recorded documents pertaining to copyright ownership
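
For a rough sense of scale, the scan parameters stated above (uncompressed, 24-bit color, 300 pixels per inch) allow an approximate size to be computed per image. The sketch below is illustrative only: the 3 x 5 inch card dimensions are an assumption not given in this notice, the function name is hypothetical, and the figure counts pixel data only, ignoring any file header overhead.

```python
def uncompressed_bytes(width_in: float, height_in: float,
                       ppi: int = 300, bits_per_pixel: int = 24) -> int:
    """Approximate pixel-data size of an uncompressed scan, in bytes."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


# Assumption: a catalog card measures 3 x 5 inches (not stated in the notice).
per_card = uncompressed_bytes(3, 5)

# The notice cites 49 million catalog cards.
total = per_card * 49_000_000

print(f"{per_card:,} bytes per card")
print(f"{total / 1e12:.2f} TB for 49 million cards")
```

Under these assumptions each card image is roughly 4 MB uncompressed, which suggests why smaller derivative files would be preferred for distribution to keyers.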

Crowd Sourcing Requirement

A preliminary analysis of the data capture tasks indicates that some may be simple enough to be carried out by persons unfamiliar with copyright information or the format of the records. For example, some records are in a ledger style with specific fields of data. Some data capture may involve keying the identifying record number from a page image. Some may require looking at a catalog card header and the context of the card and keying a code that indicates whether the header is a name or a title.
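
The name-or-title keying task described above could be represented as a small validated record. The following is a minimal sketch, not a specification from the Copyright Office: the codes "N" and "T", the field names, and the function `parse_capture` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class HeaderType(Enum):
    """Hypothetical codes a keyer might enter for a card header."""
    NAME = "N"
    TITLE = "T"


@dataclass(frozen=True)
class HeaderCapture:
    """One keyed response for one card image (field names are illustrative)."""
    image_id: str            # identifier of the card image shown to the keyer
    header_text: str         # header as keyed from the card
    header_type: HeaderType  # whether the header is a name or a title


def parse_capture(image_id: str, header_text: str, code: str) -> HeaderCapture:
    """Validate raw keyed input and build a HeaderCapture.

    Raises ValueError if the header is empty or the code is not recognized.
    """
    text = header_text.strip()
    if not text:
        raise ValueError("header text is empty")
    # Enum lookup by value raises ValueError for an unknown code.
    return HeaderCapture(image_id, text, HeaderType(code.strip().upper()))
```

For example, `parse_capture("card-0001", " Doe, Jane ", "n")` would yield a record with the trimmed header text and `HeaderType.NAME`, while an unrecognized code such as `"x"` would be rejected with a `ValueError`.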

Intellectual Property Rights

The Government shall retain full ownership rights to all deliverables from any future contracts involving the Copyright Digitization and Public Access project including all digital versions of Copyright records, all image files, all data and index files, and all project management and status reports. Such rights shall include both tangible and intangible rights including but not limited to copyright, trademark, patent, trade secret, and unfair competition. The contractor may claim no rights or legal interest in delivered material including electronic files, their content, or the organization structure of the files or their indexes.

RFI Instructions

Interested vendors or organizations should address the following in their submissions:

1. Provide a detailed description of the crowd sourcing service you operate.
2. Are there limits on the volume of records or tasks that can be supported?
3. How many persons are registered with you to carry out crowd sourcing tasks?
4. How do you select and manage persons who are carrying out crowd sourcing tasks?
5. How do you determine whether a task is suitable for crowd sourcing?
6. How do you make a projection of when a set of tasks will be completed?
7. How is payment made to the persons carrying out crowd sourcing tasks?
8. How do you invoice the Government for completed tasks?
9. What administrative fees do you charge?
10. How do you protect against loss of data?
11. Do you provide design and development of software for specific data capture tasks?
12. What size and type image files do you recommend for crowd source data capture?
13. What are your standard quality assurance procedures?
14. What are your standard project management procedures?
15. What are your standard data security procedures?



The records of the Copyright Office referenced in this RFI are public records and may be inspected by interested vendors or organizations during regular business hours in room LM-404 of the Madison Building of the Library of Congress. Reader registration is required before access to the records is granted.

Interested parties are requested to provide information on their ability to satisfy any or all of the capabilities outlined above.

101 Independence Ave SE
Washington, District of Columbia 20540-9411
Library of Congress
101 Independence Ave., S.E.
Washington, District of Columbia 20540
United States
Sidney Wise, Contracting Officer
Phone: 202-707-7620