CasePredictor Editorial

Where USCIS Processing-Time Data Comes From

Good immigration analytics starts with traceable public inputs. This guide walks through the official USCIS and State Department datasets behind CasePredictor and explains what each source can tell you, what it cannot tell you, and how the site turns those inputs into static pages and tools.

Sourcing and corrections follow our editorial standards.

A processing-time site is only as useful as the data discipline behind it. If readers cannot trace the numbers back to public sources, then every chart eventually turns into a trust problem.

That is especially true in immigration, where people make expensive, high-stakes decisions from incomplete information. A site does not need to be perfect to be useful, but it does need to be clear about where each number comes from and where the gaps are.

CasePredictor was built around that constraint. The goal is not to invent a secret dataset. The goal is to reorganize the official datasets into something ordinary filers can actually use.

How this connects to the site

One of the site's core value propositions is that every major number comes from a public source, not from scraped forum anecdotes. This post makes that provenance explicit so users can judge the tool on transparent inputs rather than opaque claims.

The current USCIS processing-times tool

The most obvious input is the public USCIS Processing Times tool. This is where the current median (P50) and 93rd-percentile (P93) values come from for each tracked form and subtype.

Those numbers are useful because they anchor what the queue looks like right now. They answer the question of what the published typical and slower waits are today better than any historical average can.

But they also have limits. They are aggregate percentiles, not case-level data. They do not tell you why a particular case is delayed, and they do not, by themselves, capture whether the form has been speeding up or slowing down over time.
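To make the shape of this input concrete, here is a minimal sketch of how one published entry could be represented. The field names and the sample values are hypothetical illustrations, not CasePredictor's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProcessingSnapshot:
    """One published USCIS processing-time entry (hypothetical schema)."""
    form: str           # e.g. "I-485"
    subtype: str        # e.g. "Employment-based"
    office: str         # service center or field office
    p50_months: float   # published median wait
    p93_months: float   # published slower-case wait

# Illustrative values only, not real published figures.
snap = ProcessingSnapshot("I-485", "Employment-based",
                          "Nebraska Service Center", 9.5, 24.0)
```

Keeping each entry as a small immutable record like this makes snapshots easy to diff between reporting periods, which matters for the historical layer described next.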

The historical processing-time archive

To understand direction, you need historical snapshots. That is why the site also uses USCIS's historical processing-time archive. Those snapshots make it possible to chart how medians moved from one reporting period to the next.

This historical layer is what powers trend overlays, year-over-year comparisons, and the 'faster' or 'slower' context that sits alongside an ETA. A raw current percentile tells you where the queue is now. Historical data helps you judge whether the queue has been drifting toward improvement or regression.

Historical data still does not give per-case causation, but it gives something the single current page cannot: momentum.
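The "faster" or "slower" label described above can be computed from nothing more than a sequence of medians. This is a simplified sketch of one plausible heuristic (the half-month threshold is an assumption, not the site's actual rule):

```python
def median_trend(medians):
    """Classify drift across consecutive reporting-period medians (in months).

    Returns 'faster', 'slower', or 'flat' based on the net change from the
    first snapshot to the last. The 0.5-month dead band is illustrative.
    """
    if len(medians) < 2:
        return "flat"  # a single snapshot carries no momentum signal
    delta = medians[-1] - medians[0]
    if delta < -0.5:
        return "faster"   # queue has been improving
    if delta > 0.5:
        return "slower"   # queue has been regressing
    return "flat"
```

A dead band around zero keeps small reporting-period noise from flipping the label back and forth between builds.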

Quarterly USCIS volume reports

Current timing and historical timing are not enough by themselves. The site also uses USCIS quarterly datasets that report receipts, approvals, denials, and pending counts by form.

These reports are important because they let you see the queue from an operational angle. Is the pending caseload growing? Is adjudication volume keeping up with receipts? Are approvals concentrated in a form that looks fast on the percentile page but is actually absorbing a huge backlog?

That volume layer is what turns the site from a simple wait-time mirror into a broader throughput dashboard. It gives context for why processing times may be changing, not just the fact that they changed.
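The operational questions above reduce to simple backlog arithmetic: pending caseload grows when receipts outpace completions. A minimal sketch (the numbers in the test are invented, not real USCIS figures):

```python
def pending_series(start_pending, quarters):
    """Roll a pending count forward through per-quarter (receipts, completions).

    Each quarter's net change is receipts minus completions; a positive net
    means the backlog grew that quarter.
    """
    pending = [start_pending]
    for receipts, completions in quarters:
        pending.append(pending[-1] + receipts - completions)
    return pending
```

Charting this series next to the percentile page is what exposes the "looks fast but is absorbing a backlog" pattern.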

Visa Bulletin, inventory, and demand datasets

Employment-based and family-preference timelines require a second family of data: the Department of State Visa Bulletin plus USCIS inventory and demand reports where available.

The Visa Bulletin shows cutoff movement over time. Inventory and demand datasets help answer a different question: how many cases appear to be ahead in line by country, category, and in some cases priority-date band. Together, those sources let the site estimate whether a wait is driven mainly by adjudication pace or by the visa queue itself.

This is one of the most important distinctions on the whole site. Without the bulletin and queue data, an I-485 prediction for a backlogged category would be incomplete because it would treat USCIS adjudication time as if it were the whole story.
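One rough way to frame that distinction in code: compare how long the visa queue would take to drain against the published adjudication time, and treat the larger of the two as the binding constraint. This heuristic and its inputs are illustrative assumptions, not CasePredictor's actual model.

```python
def binding_constraint(cases_ahead, annual_visas, adjudication_months):
    """Rough test of whether the visa queue or adjudication pace dominates.

    cases_ahead        -- estimated cases ahead in line (inventory/demand data)
    annual_visas       -- yearly visa numbers available to the category/country
    adjudication_months-- published processing time for the form

    Hypothetical heuristic: if draining the queue takes longer than
    adjudication, the queue is the binding constraint.
    """
    if annual_visas <= 0:
        return "visa queue"
    queue_months = (cases_ahead / annual_visas) * 12
    return "visa queue" if queue_months > adjudication_months else "adjudication"
```

For a heavily oversubscribed category, the queue term dwarfs the adjudication term, which is exactly why an I-485 estimate built from processing times alone would mislead.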

Why the site refreshes monthly and ships static JSON

The site is a static export, not a live server application. That means the public datasets are refreshed on a schedule, written into versioned project data, and then shipped as static assets with the next build.

That architecture is deliberate. Static assets are cheap to serve, easy to cache, and predictable. It also makes the data lineage easier to audit because the build is based on concrete checked-in files rather than live runtime requests to third-party services.

From an SEO and reliability standpoint, that is a useful tradeoff. The pages stay fast and crawlable, while the underlying numbers still update on a regular cadence that matches how often the official sources themselves tend to move.
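The refresh step described above can be pictured as a small build script that writes a dated JSON file into the checked-in data directory. The path layout and function name here are illustrative, not the site's actual structure:

```python
import datetime
import json
import pathlib

def write_snapshot(records, data_dir="data"):
    """Write a dated JSON snapshot that the static build will ship as an asset.

    Because the file is committed before the build runs, the lineage of every
    published number is auditable from the repository history.
    """
    stamp = datetime.date.today().isoformat()  # e.g. "2025-06-01"
    out = pathlib.Path(data_dir) / f"processing-times-{stamp}.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(records, indent=2, sort_keys=True))
    return out
```

Sorted keys and fixed indentation keep the diffs between monthly snapshots readable, which is most of the audit value.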

What the data can and cannot tell you

Public USCIS and State Department datasets are strong enough to support grounded analytics, but they are not omniscient. They do not explain every individual RFE, transfer, interview hold, or background-check delay. They also do not expose all the internal routing logic that shapes real-world adjudication order.

That is why good immigration analytics should be transparent about both coverage and limits. A responsible site can tell you how comparable cases have been moving, whether the queue is expanding or contracting, and whether visa-number availability is likely to be the binding constraint. It cannot promise the exact day your officer will open the file.

In other words, the data is strong enough to improve judgment, not strong enough to eliminate uncertainty. For most users, that is still a very meaningful upgrade over reading one official range in isolation.
