EXHIBITX BLOG
How to actually use a 291-page congressional hearing transcript
On January 21, 2026, the House Small Business Subcommittee on Rural Development, Energy, and Supply Chains held a hearing titled "Empowering Rural America Through Investment in Innovation." The transcript runs 291 pages.
Those 291 pages contain opening statements from Chairman Jake Ellzey of Texas and Ranking Member Kelly Morrison of Minnesota. Testimony from three witnesses: Chris Crosby, CEO of Compass Datacenters; Kirk Offel, CEO of Overwatch Mission Critical and a Navy submarine veteran; and Dr. Nicol Turner Lee, a senior fellow at the Brookings Institution. Multiple rounds of five-minute Q&A between subcommittee members and witnesses. Prepared written statements running dozens of pages each. And an appendix of letters submitted for the record from eight organizations including the Fiber Broadband Association, NextEra Energy, Loudoun County's Board of Supervisors, Food & Water Watch, and the American Sustainable Business Network.
This is a single hearing from a single subcommittee on a single day. Congress holds thousands of these per session.
For anyone who needs to use the information in a congressional transcript — lobbyists, policy researchers, journalists, attorneys, advocacy organizations — the document's format is the obstacle, not its content.
The format problem
Congressional hearing transcripts follow a rigid structure dictated by the Government Publishing Office. The hearing opens with formalities (pledge, prayer, procedural motions), moves through opening statements from the chair and ranking member, introduces witnesses, proceeds through witness testimony, enters multiple rounds of member questioning under the five-minute rule, and closes with an appendix of prepared statements and submitted materials.
The result is a document that's organized by the order things happened, not by what was said about any particular topic.
This hearing covered data centers, rural economic development, AI infrastructure, energy grid capacity, water usage, workforce training, veterans employment, small business contracting, environmental regulation, national security competition with China, the Inflation Reduction Act, the CHIPS Act, broadband funding, tax incentives, community benefit agreements, stranded asset risk, and more.
Those topics don't appear in neat, labeled sections. They surface in opening statements, resurface in witness testimony, get probed during member questions, appear again in prepared statements, and receive additional treatment in the submitted letters. A single topic might be discussed on page 3, page 12, page 18, page 24, and page 210 — each time by a different speaker with a different perspective.
If you need to know everything this hearing produced about water usage by data centers, you're reading 291 pages. If you want to compare what Crosby said about energy costs with what Turner Lee said about the same topic, you're scanning two separate testimonies and multiple Q&A exchanges. If you need to find a specific claim, such as Offel's statistic that 75 percent of AI infrastructure is driven by small businesses, or Crosby's figure that new sales tax business permits in Red Oak, Texas, grew from 12 percent to 43 percent annually after construction began, you're hunting through dense verbatim dialogue.
What the hearing actually covered
The substance of this hearing is significant. It captures a live policy debate between industry executives, an academic policy expert, and members of Congress from both parties — each with different interests and different information to contribute.
Chris Crosby described Compass Datacenters' investment in Red Oak, Texas — in Chairman Ellzey's district — as a case study in rural economic transformation. He cited 1,500 construction jobs during a decade-long buildout, over 400 permanent positions, a partnership with Schneider Electric to build a 105,000-square-foot manufacturing facility on campus, and a claim that data centers "pay more so that families and small businesses can pay less" by serving as anchor tenants on an underutilized grid. He described data center campuses as "100-year assets" and compared them to the railroads — arguing that communities that rejected past infrastructure revolutions were left behind.
Kirk Offel framed the data center boom as "the fifth industrial revolution" and made a forceful case that small businesses are its backbone, not a sideshow. He cited a set of figures: for every $100 billion invested in digital infrastructure, the economy gets $800 million in local tax revenue, 500,000 new jobs, and a $140 billion GDP boost. He warned that if the U.S. doesn't build this infrastructure domestically, it will go to countries with "worse environmental standards, less transparency, and strategic interests directly opposing our own." And he acknowledged directly that the industry failed to communicate early enough: "We didn't clearly explain what we were building, why it mattered, or how communities would benefit, and we relied on the trust we hadn't yet earned."
Dr. Nicol Turner Lee offered the counterweight. She argued that data centers require a national framework for resources, security, permitting, and community benefit. She noted that some tech companies are "greatly benefiting from the explicit support of the President" through expedited permitting and bypassed environmental review. She raised the risk of stranded assets if the AI bubble deflates, warned that rural workers lack training in the trades that data centers require, and pushed for community benefit agreements that give residents transparency and the power to say no.
Ranking Member Morrison introduced the partisan tension directly. She pointed to the IRA's clean energy tax credits, the Bipartisan Infrastructure Law's water and broadband investments, and the CHIPS Act's domestic manufacturing incentives, all of which she argued are being rolled back. She cited $154 million in proposed cuts to rural water and wastewater grants and the diversion of broadband funding to Starlink.
The appendix adds eight letters from external organizations weighing in on energy costs, environmental impact, broadband access, and community oversight.
This is a lot of material. It's also exactly the kind of material that people cite, quote, and build policy arguments from for months after the hearing.
What an index makes possible
When you feed a congressional transcript into Fast Facts, the platform processes the entire document — opening statements, testimony, Q&A, prepared statements, and appendix materials — and extracts every person, organization, place, policy, statistic, and concept mentioned throughout.
The result is a searchable index where every term links back to the exact passage in the transcript where it appears.
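To make the idea concrete, here is a minimal Python sketch of a term index that maps each term to the passages mentioning it. It is an illustration under stated assumptions, not a description of how Fast Facts works internally: the `Passage` type, the `build_index` function, and the sample passages (paraphrased from this post, not quoted from the transcript) are all hypothetical, and plain substring matching stands in for real entity extraction.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    passage_id: int  # position in the sample list; stands in for a real page/line locator
    speaker: str
    text: str


# Hypothetical sample data, paraphrased from the summary in this post.
PASSAGES = [
    Passage(0, "Crosby", "Data centers pay more so that families and small businesses can pay less."),
    Passage(1, "Offel", "75 percent of AI infrastructure is driven by small businesses."),
    Passage(2, "Morrison", "$154 million in proposed cuts to rural water and wastewater grants."),
    Passage(3, "Turner Lee", "82 percent of data centers use water for cooling."),
]


def build_index(passages, terms):
    """Map each term to the passages whose text mentions it (case-insensitive)."""
    index = defaultdict(list)
    for passage in passages:
        lowered = passage.text.lower()
        for term in terms:
            if term.lower() in lowered:
                index[term].append(passage)
    return dict(index)


index = build_index(PASSAGES, ["water", "small businesses", "data centers"])

# Every hit carries its locator, so a term always leads back to a passage.
for hit in index["water"]:
    print(hit.passage_id, hit.speaker, hit.text)
```

The point of the structure is the locator: each entry keeps a pointer to where the term appeared, which is what lets a reader jump from a term to its source passage instead of rereading the document.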
We used the following custom instruction to guide the index:
Extract named speakers (members of Congress and witnesses) and attribute statements to them, organizations mentioned (companies, agencies, advocacy groups), specific statistics and claims cited in testimony, policy references (legislation, executive orders, funding programs like IRA, CHIPS Act, BEAD, Boots to Business), geographic locations, industry terms (data centers, AI infrastructure, energy grid, workforce development), and topics raised during Q&A exchanges. Surface what each speaker said about each topic so the index can be navigated by person or by subject.
Track what any member or witness said
Congressional hearings are conversations. The same person speaks multiple times — during opening statements, during their testimony, during each round of Q&A. Their positions evolve or sharpen across these exchanges.
An index lets you pull up every statement by a specific speaker across the full transcript. Search for "Crosby" and you see his prepared testimony, his answers to Ellzey's questions, his exchanges with other members — all in one view. Compare that against "Turner Lee" and you see where the industry executive and the policy researcher agreed and where they diverged, without reading the entire hearing linearly.
This is particularly valuable for tracking what members of Congress said. Each representative gets five minutes of questioning. The questions they choose to ask reveal their priorities, their concerns, and often their likely positions on upcoming legislation. An index surfaces those exchanges as navigable entries rather than buried dialogue.
Cross-reference topics across speakers
When Chairman Ellzey asks about Texas's attractiveness for data centers, Crosby talks about workforce and regulatory certainty. When Morrison raises energy costs, Turner Lee talks about rising residential electricity bills. When the topic of veterans comes up, Offel speaks from personal experience as a Navy submariner, Crosby mentions partnerships with veteran-focused organizations, and Ellzey references the Boots to Business program he co-led.
In the transcript, these exchanges are separated by pages of other dialogue. In an index, a search for "veterans" or "energy" or "water" pulls every relevant statement from every speaker into a single view. The cross-cutting picture that the transcript buries, the index reveals.
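As a rough sketch of that single view, the Python below groups statements by speaker for one topic. The topic tags, the `by_topic` function, and the one-line summaries (drawn from the examples in this post, not from the transcript itself) are assumptions made for illustration; a real index would assign topics automatically rather than by hand.

```python
# Each record: (speaker, set of topic tags, short summary of what was said).
# Hypothetical, hand-tagged data based on the examples in this post.
STATEMENTS = [
    ("Crosby", {"texas", "workforce"}, "Credits workforce and regulatory certainty for Texas's appeal."),
    ("Turner Lee", {"energy"}, "Points to rising residential electricity bills."),
    ("Offel", {"veterans"}, "Speaks from personal experience as a Navy submariner."),
    ("Crosby", {"veterans"}, "Mentions partnerships with veteran-focused organizations."),
    ("Ellzey", {"veterans"}, "References the Boots to Business program."),
]


def by_topic(statements, topic):
    """Return {speaker: [summaries]} for every statement tagged with the topic."""
    view = {}
    for speaker, topics, summary in statements:
        if topic in topics:
            view.setdefault(speaker, []).append(summary)
    return view


veterans_view = by_topic(STATEMENTS, "veterans")
for speaker, summaries in veterans_view.items():
    print(f"{speaker}: {'; '.join(summaries)}")
```

Swapping the grouping key gives the speaker view instead: filter on the speaker name and group by topic to see everything one witness said, subject by subject.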
Find specific claims and statistics
Policy researchers and journalists mine congressional testimony for citable claims. This hearing is packed with them:
- 75% of AI infrastructure driven by small businesses (Offel)
- For every direct data center job, up to seven supported in the surrounding community (Ellzey)
- New sales tax business permits in Red Oak grew from 12% to 43% annually after construction began (Crosby)
- $5 trillion in data center construction projected over the next five years (Offel)
- 5,400 data centers currently in the U.S. (Offel)
- $154 million in proposed cuts to rural water and wastewater grants (Morrison)
- 82% of data centers use water for cooling (Turner Lee, prepared statement)
In a 291-page PDF, these numbers are needles in a haystack. In an index, they're anchored to the speakers who cited them, retrievable in seconds.
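One simple way to picture how figures get anchored to speakers is a pattern match over attributed text. The sketch below is a simplified stand-in, not the platform's actual extraction method: the `STAT_PATTERN` regex and the `extract_stats` function are hypothetical, and the pattern only covers dollar amounts and percentages, so a plain count like "5,400 data centers" would need its own rule.

```python
import re

# Matches dollar amounts ("$154 million") and percentages ("75%", "12 percent").
# Plain counts such as "5,400 data centers" are deliberately out of scope here.
STAT_PATTERN = re.compile(
    r"\$\d[\d,.]*(?:\s(?:million|billion|trillion))?"
    r"|\d[\d,.]*\s?(?:%|percent)"
)

# (speaker, claim) pairs taken from the list of claims in this post.
CLAIMS = [
    ("Offel", "75% of AI infrastructure driven by small businesses"),
    ("Crosby", "Sales tax business permits in Red Oak grew from 12% to 43% annually after construction"),
    ("Offel", "$5 trillion in data center construction projected over the next five years"),
    ("Morrison", "$154 million in proposed cuts to rural water and wastewater grants"),
]


def extract_stats(claims):
    """Return (figure, speaker) pairs for every figure found in the claims."""
    stats = []
    for speaker, text in claims:
        for match in STAT_PATTERN.finditer(text):
            stats.append((match.group(), speaker))
    return stats


stats = extract_stats(CLAIMS)
for figure, speaker in stats:
    print(f"{figure:>12}  {speaker}")
```

Keeping the speaker attached to each figure is the part that matters for citation: a number without its source is much harder to quote responsibly.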
Navigate the appendix
Roughly half the transcript's page count is appendix material — prepared written statements (which are far more detailed than the oral testimony) and submitted letters from organizations that weren't at the hearing but wanted to be on the record.
The Loudoun County letter describes what happens when a community becomes the epicenter of data center development — Northern Virginia hosts the densest concentration of data centers in the world. The Food & Water Watch letter raises environmental concerns. The NextEra Energy letter addresses the energy supply question from a utility's perspective. The Lazard letter provides levelized cost of energy data.
These materials are often more substantive than the hearing itself, but they're buried at the back of a long PDF. An index treats them as part of the same document, surfacing their content alongside the oral testimony so that a search for "water" returns results from the hearing room and the written submissions alike.
Who uses congressional transcripts
Lobbyists and government affairs professionals
Lobbyists track specific committee members, specific policy topics, and specific language that signals where legislation is headed. A single hearing can contain early signals about upcoming bills, regulatory priorities, or funding decisions. An index lets them monitor those signals across dozens of hearings without reading each one cover to cover.
Policy researchers and think tanks
Researchers studying a policy area — rural broadband, energy infrastructure, AI regulation — need to compile what Congress has heard on the topic across multiple hearings, sessions, and committees. An index turns each hearing into a searchable unit that can be queried by topic, yielding the specific testimony and exchanges relevant to a research question.
Journalists
Reporters covering a hearing often arrive late, leave early, or cover it from the transcript after the fact. They need to find the newsworthy exchange — the confrontation, the admission, the specific number. An index surfaces the moments that matter without requiring a full read of the procedural scaffolding around them.
Attorneys and regulatory professionals
Lawyers working on matters affected by legislation or regulation mine hearing transcripts for legislative intent — what Congress was told, what members said they were trying to accomplish, what concerns were raised. This is especially valuable in administrative law, where agency action can be challenged based on whether it's consistent with congressional direction.
Advocacy organizations
Groups that submitted letters for the record or testified at a hearing need to track how their positions were received. Groups that didn't participate need to monitor what was said about issues they care about. An index makes both tasks manageable.
The scale problem
This was one hearing, on one topic, before one subcommittee. The full Committee on Small Business holds dozens of hearings per session. Congress as a whole produces thousands of hearing transcripts, many of them running hundreds of pages.
No one can read all of them. But the people who need information from them (about a specific policy, a specific member's position, a specific industry's concerns) need to find it reliably and quickly. The GPO publishes transcripts as downloadable PDFs. They're public. They're free. And they're nearly impossible to search by topic or by speaker.
An index changes the economics of using this material. Instead of choosing between reading a 291-page transcript and ignoring it, you can index it and search it — finding the three pages that matter for your work without wading through the other 288.
Congressional hearings are the public record of how policy gets made. An index makes that record usable.
Fast Facts is an AI-powered document indexing platform. Upload any document — congressional transcripts, regulatory filings, policy reports, legal proceedings — and get a structured, searchable index in minutes. Try it free.