RFI - Approach to Developing Description and Authority Services

Solicitation Number: NARA_ARC_RFI
Agency: National Archives and Records Administration
Office: NAA
Location: Acquisitions Division
Notice Type: Presolicitation
Added: Mar 16, 2009 8:01 am
Background



The Archival Research Catalog (ARC) is NARA's centralized system for describing materials held throughout the agency's nationwide system of archival repositories. ARC is composed of two subsystems: ARC Client/Server (C/S), which is the data entry system accessible only to NARA staff and approved contractors, and ARC Web, which is the public read-only version of the data entered and maintained in ARC C/S.



As an integrated system, ARC enables NARA staff to intellectually describe and control archival holdings at appropriate descriptive levels for all media. As NARA’s primary, centralized data discovery tool, ARC ensures that internal and external users can discover information about NARA's holdings through keyword searching, as well as by provenance and other access points controlled by the ARC authority files. Through the implementation and quality control of uniform data standards, NARA staff can describe holdings in ARC efficiently, and public users are assured consistent, relevant data retrieval and can determine the scope and location of NARA's holdings. The number of links in ARC to online digital copies of NARA holdings, as well as to electronic records maintained in the Access to Archival Databases (AAD) system, is increasing at an accelerating rate. With the number of electronically available records growing rapidly, ARC no longer simply provides information about holdings, but increasingly provides direct access to the holdings themselves.



ARC C/S was developed by customizing a Commercial Off-the-Shelf (COTS) product known as ‘OLIB’ to meet NARA’s requirements. The system was customized by Fretwell Downing Informatics (FDI), which is now part of OCLC. ARC C/S was launched in production in 2003 and has received several software application upgrades to correct system deficiencies and provide enhancements to meet the continuing descriptive needs of NARA staff. ARC C/S now supports more than 500 end users. The production architecture consists of primary, standby, and training servers, all Sun V-440s running Oracle 10g Enterprise Edition and Solaris 10. Failover between the primary and standby servers is configured using Oracle Data Guard. Remote user access to the ARC C/S servers is provided by load balancing users across five Windows 2003 Terminal Servers.



ARC Web was initially launched in 2002 and was re-engineered in 2008. The production architecture consists of three load-balanced Sun V20z web servers, which connect to an Oracle RAC instance running on three Sun V-440 database servers. The data transfer process, which moves new and updated descriptions from ARC C/S to ARC Web and can run in either full or update mode, runs on a Sun V-440 indexing server using a Sunopsis ETL tool.



In addition to the two major components of ARC (C/S and Web), there are two standalone tools to allow for the bulk import and approval of archival descriptions:



ARC Import Tool: allows ARC staff to upload item and file unit descriptions from non-standard, legacy finding aids and from NARA’s digitizing partners. This tool requires that ARC staff first format the data into a standard XML format. Currently, the descriptions are imported as “draft” descriptions that must be approved prior to publication in ARC Web.



ARC Synchronization Tool: allows ARC staff to approve/synchronize the imported archival descriptions so that they can be transferred to ARC Web. This tool was developed so that staff no longer needed to manually approve individual or small batches of uploaded archival descriptions. However, it still takes about 1 system hour to approve/synchronize 10,000 descriptions for ARC Web publication.
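The throughput quoted above (about 10,000 descriptions per system hour) can be turned into a rough planning estimate. The sketch below assumes the rate stays constant as batch size grows, which the notice does not claim; the function name and the batch sizes are illustrative only.

```python
# Throughput figure from the text: about 10,000 descriptions per system hour.
DESCRIPTIONS_PER_HOUR = 10_000


def sync_hours(description_count: int, per_hour: int = DESCRIPTIONS_PER_HOUR) -> float:
    """Estimate system hours to approve/synchronize a batch.

    Assumes throughput is constant regardless of batch size (an assumption,
    not a measured property of the ARC Synchronization Tool).
    """
    if description_count < 0:
        raise ValueError("description_count must be non-negative")
    return description_count / per_hour


if __name__ == "__main__":
    # Illustrative batch sizes, not figures from the notice.
    for batch in (10_000, 250_000, 1_000_000):
        print(f"{batch:>9,} descriptions -> {sync_hours(batch):6.1f} system hours")
```

Under that assumption, a batch of one million descriptions would occupy roughly 100 system hours of synchronization time.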



The ARC C/S physical data model and source code, as well as the data transfer code, are proprietary to and remain the property of OCLC. NARA owns the ARC Web searching and display source code.



As mentioned above, ARC maintains information about and links to digitized holdings that are maintained either on NARA media servers or are maintained by NARA's digitizing partners. There are currently 147,433 digital copies available in ARC. ARC also supports linking to several object data types, including:



ASCII Text

Audio/Visual (RealMedia Video Stream)

Audio/Visual File (AVI)

Audio/Visual File (MOV)

Audio/Visual File (MP4)

Audio/Visual File (WMV)

Image (BMP)

Image (GIF)

Image (JPG)

MS Excel Spreadsheet

Microsoft PowerPoint Document

Microsoft Word Document

Microsoft Write Document

Portable Document File (PDF)

Sound File (MP3)

Sound File (WAV)

World Wide Web Page



ARC also includes links to electronic records that are available via NARA's Access to Archival Databases (AAD) system and links to internal and external web pages. AAD is available online at http://aad.archives.gov/aad/



ARC Statistics



ARC currently contains 2,478,259 archival descriptions, broken down as follows:



500 Record Groups

2,299 Collections

83,030 Series

2,102,311 File Units

290,119 Items



There are also 6,244,437,111 logical data records and 354,665 artifacts described in ARC.



ARC contains 8,810,938 authority records, broken down as follows:



Specific Records Types: 8,902

Topical Subjects: 22,205

Person Authorities: 4,800,766

AACR2 Organizations: 1,712,421

AACR2 Meetings: 222,224

Organizations: 980,222

Geographic locations: 1,064,198
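As a quick consistency check, the two breakdowns above can be summed and compared with the stated totals. This is only an arithmetic sketch using the figures printed in this notice; the variable names are illustrative.

```python
# Counts copied from the ARC Statistics section of this notice.
archival_descriptions = {
    "Record Groups": 500,
    "Collections": 2_299,
    "Series": 83_030,
    "File Units": 2_102_311,
    "Items": 290_119,
}
authority_records = {
    "Specific Records Types": 8_902,
    "Topical Subjects": 22_205,
    "Person Authorities": 4_800_766,
    "AACR2 Organizations": 1_712_421,
    "AACR2 Meetings": 222_224,
    "Organizations": 980_222,
    "Geographic locations": 1_064_198,
}

STATED_DESCRIPTION_TOTAL = 2_478_259
STATED_AUTHORITY_TOTAL = 8_810_938


def total(breakdown: dict) -> int:
    """Sum the per-category counts of a breakdown."""
    return sum(breakdown.values())


if __name__ == "__main__":
    print("descriptions:", total(archival_descriptions),
          "matches stated total:", total(archival_descriptions) == STATED_DESCRIPTION_TOTAL)
    print("authorities: ", total(authority_records),
          "matches stated total:", total(authority_records) == STATED_AUTHORITY_TOTAL)
```

Both category sums agree with the totals given in the text.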



On average, 25,000 archival descriptions are added to ARC each week. This includes both new descriptions created by NARA’s program offices as well as legacy finding aid descriptions uploaded using the ARC import tool. New and updated descriptions are transferred to ARC Web twice a week.



Shared Service Approach & ARC Logical Data Model: SEE ATTACHMENTS TO THIS NOTICE.



Statement of Need



The existing ARC C/S system has enabled NARA to add nearly 2.5 million archival descriptions and to make those descriptions available to the public via the ARC Web system. However, the existing system lacks the scalability necessary to support the millions of archival descriptions that are resulting from NARA’s digitizing partnerships. Due to the rapid processing pace of NARA’s digitizing partners, there is already a backlog of several million descriptions that will be associated with 11 million digital copies, and 30 million more digital copies are expected to be delivered this year. NARA is unable to import the corresponding archival descriptions into ARC C/S because doing so would have the following impacts:



1. The amount of time it takes to complete a full data transfer from ARC C/S to ARC Web will increase substantially. A full transfer of 2,478,259 archival descriptions currently takes 72 hours to complete. It is predicted that the full transfer will take 240 hours to complete against a dataset of 11,000,000 records.

2. The time window for copying data from the C/S system to the Intermediate Web database will increase. This copy currently takes 2 hours and could take up to 10 hours with a dataset of 11,000,000 records. This will affect both the full and update data transfer processes and will reduce the amount of off-hour system processing time available to run backups, conduct a nightly re-indexing job, and accommodate update and full data transfers.

3. The addition of 11,000,000 archival descriptions could degrade C/S system performance (e.g., searches and moving between data entry tabs in data entry layouts).
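For comparison with the predictions in items 1 and 2, the sketch below scales the current durations in direct proportion to record count. Proportional scaling is an assumption made here for illustration; the notice does not say how its 240-hour and 10-hour figures were derived, and the function name is hypothetical.

```python
# Figures quoted in items 1 and 2 above.
CURRENT_RECORDS = 2_478_259
TARGET_RECORDS = 11_000_000
FULL_TRANSFER_HOURS = 72     # current full transfer from ARC C/S to ARC Web
COPY_WINDOW_HOURS = 2        # current copy to the Intermediate Web database
PREDICTED_FULL_HOURS = 240   # prediction stated in the notice
PREDICTED_COPY_HOURS = 10    # upper bound stated in the notice


def linear_estimate(current_hours: float,
                    current_records: int = CURRENT_RECORDS,
                    target_records: int = TARGET_RECORDS) -> float:
    """Scale a duration in proportion to record count.

    Proportional scaling is an assumption for illustration, not a measurement.
    """
    return current_hours * target_records / current_records


if __name__ == "__main__":
    full = linear_estimate(FULL_TRANSFER_HOURS)
    copy = linear_estimate(COPY_WINDOW_HOURS)
    print(f"Full transfer: linear estimate {full:.1f} h, notice predicts {PREDICTED_FULL_HOURS} h")
    print(f"Copy window:   linear estimate {copy:.1f} h, notice allows up to {PREDICTED_COPY_HOURS} h")
```

A purely proportional estimate gives roughly 320 hours for the full transfer and just under 9 hours for the copy window, so the notice's 240-hour prediction is somewhat more optimistic than linear scaling, while its 10-hour copy figure is slightly more conservative.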



NARA has outgrown the existing ARC system and requires a more robust solution capable of scaling to support at least 250 million archival descriptions and links to upwards of 500 million digital copies over the next 4-7 years.
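To put the 250 million target in perspective, the following sketch computes the average weekly intake needed to reach it within the 4-7 year window and compares it with the current average of 25,000 descriptions per week stated earlier. It assumes a uniform intake rate and 52 weeks per year, both simplifications introduced here.

```python
# Figures taken from this notice.
CURRENT_DESCRIPTIONS = 2_478_259
TARGET_DESCRIPTIONS = 250_000_000
CURRENT_WEEKLY_ADDITIONS = 25_000

WEEKS_PER_YEAR = 52  # simplifying assumption


def required_weekly_rate(years: float) -> float:
    """Average weekly additions needed to reach the target in the given years.

    Assumes intake is spread evenly over the period (an illustration only).
    """
    weeks = years * WEEKS_PER_YEAR
    return (TARGET_DESCRIPTIONS - CURRENT_DESCRIPTIONS) / weeks


if __name__ == "__main__":
    for years in (4, 7):
        rate = required_weekly_rate(years)
        factor = rate / CURRENT_WEEKLY_ADDITIONS
        print(f"{years} years: about {rate:,.0f} descriptions/week "
              f"({factor:.0f}x the current weekly average)")
```

On these assumptions the required intake is on the order of 680,000 to 1.2 million descriptions per week, roughly 27 to 48 times the current weekly average.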



Description:



NARA is exploring approaches to developing descriptive and authority services to replace the existing ARC C/S system and data transfer processes. Services are pieces of business software that are constructed and interoperate in a specific manner. They are collections of software components that have industry-standard Application Programming Interfaces (APIs) and perform a discrete business function. At execution time, services are accessed and operate across application boundaries (or even across organizations) in the context of a Service-Oriented Architecture (SOA), preferably leveraging Enterprise Service Bus (ESB) infrastructure to assure the quality and reliability of the software.



Because services are potentially used across a wide range of applications and environments, they must assure semantically consistent business identity and completeness for the functionality they encapsulate. A service is not intended to be a complete business application or business transaction, but rather a piece of business functionality that is designed to be reused in multiple business contexts and participate with other services and applications to support business functional requirements. NARA will provide all pertinent business process descriptions to vendors, but will hold vendors’ architects responsible for understanding service requirements and making service granularity decisions based on a set of well understood design criteria (vendor defined and NARA approved).



Service development is predicated upon the concept of a multi-tier software architecture that separates business logic from presentation and data management logic, and then packages the business logic such that it is externally accessible via a well-defined interface. This approach allows for loose coupling of application functionality that can be synchronously or asynchronously invoked based upon the needs of the application and the capabilities of the supporting software infrastructure. Figure 1 below provides a conceptual overview of the Description and Authority Services concept. See NARA’s Technical Reference Model (Appendix A) for detailed descriptions of the concepts of software modularity, coupling, cohesion, and interface invocation.
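The separation described above, with business logic packaged behind a well-defined interface and kept apart from presentation and data management, can be illustrated with a minimal sketch. Everything in it is hypothetical: the names `AuthorityRecord`, `AuthorityService`, `InMemoryAuthorityService`, and `headings_matching`, the record fields, and the sample data are invented for illustration and do not reflect NARA's data model or any required design.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Protocol


@dataclass(frozen=True)
class AuthorityRecord:
    """Hypothetical authority record shape; not the ARC data model."""
    record_id: str
    kind: str      # e.g. "person", "organization", "geographic"
    heading: str


class AuthorityService(Protocol):
    """Hypothetical service contract. Callers depend only on this interface."""

    def get(self, record_id: str) -> Optional[AuthorityRecord]: ...

    def search(self, text: str) -> List[AuthorityRecord]: ...


class InMemoryAuthorityService:
    """One possible implementation; a database-backed one could replace it
    without changing any caller, which is the loose coupling described above."""

    def __init__(self, records: Iterable[AuthorityRecord]) -> None:
        self._records = {r.record_id: r for r in records}

    def get(self, record_id: str) -> Optional[AuthorityRecord]:
        return self._records.get(record_id)

    def search(self, text: str) -> List[AuthorityRecord]:
        needle = text.casefold()
        return [r for r in self._records.values()
                if needle in r.heading.casefold()]


def headings_matching(service: AuthorityService, text: str) -> List[str]:
    """Presentation-side helper that uses only the service interface."""
    return sorted(r.heading for r in service.search(text))


if __name__ == "__main__":
    # Made-up sample data for demonstration only.
    service = InMemoryAuthorityService([
        AuthorityRecord("a1", "organization", "Example Agency"),
        AuthorityRecord("a2", "geographic", "Example City"),
        AuthorityRecord("a3", "person", "Sample, Person"),
    ])
    print(headings_matching(service, "example"))
```

Because `headings_matching` is written against the interface rather than a concrete class, the same caller could be reused by several applications, each supplying its own implementation of the service.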



VENDOR DAY APRIL 6, 2009



NARA will hold a Vendor Day on April 6, 2009 from 9:30 am to 3:00 pm at Archives II (8601 Adelphi Road, College Park, MD 20740). The Vendor Day is for the purposes of demonstrating the system and answering vendor questions prior to firms submitting their RFI responses. Firms that would like to attend Vendor Day should RSVP no later than COB March 30, 2009 to the Contracting Officer Anne Hasselbrack at anne.hasselbrack@nara.gov with the subject: VENDOR DAY.



• Please provide the Business Name, Business Size per NAICS 541512, Person’s Name, email and phone number for no more than TWO ATTENDEES per firm.



NARA will email all firms by April 3, 2009 with logistics information concerning their attendance.



Response Instructions



This RFI is for NARA’s planning purposes only, and implies no commitment on NARA’s part to acquire any products or services, or to adhere to a particular acquisition strategy should we in fact go forth with an acquisition. NARA does not intend to technically evaluate responses for purposes of later excluding interested parties (such as a “downselect”), except in the case of a decision to restrict participation to only certain socio-economic classes; no formal evaluation will occur. NARA will not reimburse any costs associated with vendors’ responses.



Any firm (including those that were unable to attend Vendor Day) that wishes to submit a response may do so by 5:00 PM April 24, 2009 to the Contracting Officer Anne Hasselbrack, at anne.hasselbrack@nara.gov with a subject of: "NARA Descriptive and Authority Services Response".



NARA is seeking responses from vendors capable of providing a total solution. Responses shall be no more than 25 pages, and include:



• Full contact information including business size for NAICS 541512 and socio-economic status

• Approach or solution to developing discrete descriptive and authority services designed to be reused in multiple business contexts and participate with other services and applications to support business functional requirements. The descriptive and authority services will need to participate with the following NARA applications:

a. Electronic Records Archives system (ERA)

b. Holdings Management System (HMS)

c. Online Public Access (OPA)

d. ARC Web

ARC Web is online at http://arcweb.archives.gov/arc/action/BasicSearchForm

• Estimated price

• Whether the solution is proprietary or not, and the means and degree of its accessibility

• Experience implementing the proposed solution for other customers

• Any contracting vehicles (FSS, GWAC, etc.) under which the solution is available

• Supplemental capabilities statements may be submitted, but they count toward the 25-page maximum









ARC Logical Data Model

Type: Other (Draft RFPs/RFIs, Responses to Questions, etc.)
Label: ARC Logical Data Model
Posted Date: March 16, 2009
Description: ARC Logical Data Model
Description: Shared Service Approach

Contracting Office Address:
8601 Adelphi Road, Room 3340
College Park, Maryland 20740-6001

Place of Performance:
NARA - Archives II
8601 Adelphi Road
College Park, Maryland 20740
United States

Primary Point of Contact:
Anne M Hasselbrack, Contracting Officer
Phone: 301-837-0521
Fax: 301-837-3227