What we’re working on.
A short list of open questions and the data we use to chase them.
“Can a place's recent temporal embedding predict next month's NDVI before the cloud clears?”
The JEPA objective gives us a target in embedding space. Mapping that target back to a calibrated index is the unfinished part. Approach: fit a small-region calibration over cloud-free reference cells, validated against held-out late Sentinel-2 passes.
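One way to picture the calibration step: a small ridge-regression head that maps embeddings to NDVI, fit on cloud-free reference cells and scored on held-out cells. Everything below is a synthetic sketch; the shapes, names, and the `tanh` stand-in for measured NDVI are assumptions, not our pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 256-d temporal embeddings per cell; NDVI in [-1, 1].
dim = 256
true_w = rng.normal(size=dim) / np.sqrt(dim)

def make_cells(n):
    z = rng.normal(size=(n, dim))
    return z, np.tanh(z @ true_w)  # synthetic stand-in for measured NDVI

z_ref, y_ref = make_cells(400)    # cloud-free reference cells
z_late, y_late = make_cells(100)  # held-out "late pass" cells

# Ridge regression as the calibration head from embedding space to the index.
lam = 1e-2
w = np.linalg.solve(z_ref.T @ z_ref + lam * np.eye(dim), z_ref.T @ y_ref)

rmse = float(np.sqrt(np.mean((z_late @ w - y_late) ** 2)))
print(f"held-out NDVI RMSE: {rmse:.3f}")
```

The point of the sketch is the evaluation split, not the regressor: calibration quality only counts when measured against passes the head never saw.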
“What's the lowest-cost sensor mix that recovers 95% of a fully-instrumented region's signal?”
Most regions cannot afford every sensor, so we ablate sensor families on labeled benchmarks and plot cost against retained accuracy.
“How few labeled examples does the LBVM need to bootstrap a new vertical?”
When a customer brings a new classification problem, the embedding should already be most of the way there. We are measuring how far that goes in practice, with k from 5 to 500.
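The measurement itself is a k-shot probe on frozen embeddings. Below, a nearest-centroid classifier stands in for the probe and Gaussian clusters stand in for the embedding space; both are assumptions for illustration, not the LBVM evaluation harness.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical embedding space: 4 classes as Gaussian clusters in 64-d.
dim, n_classes = 64, 4
class_means = rng.normal(size=(n_classes, dim)) * 0.5

def sample(label, n):
    return class_means[label] + rng.normal(size=(n, dim))

def k_shot_accuracy(k, n_test=200):
    # Fit one prototype per class from k labeled examples, then classify
    # test points by nearest prototype.
    protos = np.stack([sample(c, k).mean(axis=0) for c in range(n_classes)])
    hits = 0
    for c in range(n_classes):
        x = sample(c, n_test)
        d = ((x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
        hits += int((d.argmin(axis=1) == c).sum())
    return hits / (n_classes * n_test)

for k in (5, 50, 500):
    print(k, round(k_shot_accuracy(k), 3))
```

Sweeping k and watching where the accuracy curve flattens is the whole experiment: if the embedding is doing its job, the curve saturates long before 500 labels.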
“How small can the edge decoder be before answer quality degrades?”
The orbit/edge split assumes the decoder is cheap. We progressively distil smaller decoders against a fixed prompt set and watch where the first useful answers fall apart.
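A crude way to see the "where does it fall apart" shape: shrink a decoder's capacity step by step and score agreement with the full-size decoder on a fixed prompt set. The sketch below uses truncated SVD of a linear decoder as a stand-in for gradient-based distillation; the shapes and the linear-decoder assumption are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical setup: decoder as a linear map from embedding to answer logits.
dim, n_answers, n_prompts = 128, 16, 512
teacher_w = rng.normal(size=(dim, n_answers))
prompts = rng.normal(size=(n_prompts, dim))
teacher_answers = (prompts @ teacher_w).argmax(axis=1)

def distil(rank):
    # Rank-constrained student fit to the teacher via truncated SVD
    # (a stand-in for actual distillation training).
    u, s, vt = np.linalg.svd(teacher_w, full_matrices=False)
    student_w = (u[:, :rank] * s[:rank]) @ vt[:rank]
    student_answers = (prompts @ student_w).argmax(axis=1)
    return float((student_answers == teacher_answers).mean())

for rank in (16, 8, 4, 2, 1):
    print(rank, round(distil(rank), 3))
```

Plotting agreement against rank gives the degradation curve; the knee of that curve is the smallest decoder worth shipping to the edge.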
- Sentinel-1, Sentinel-2. ESA Copernicus, 10m. SAR backscatter and the multispectral imager are the workhorses.
- MODIS. Daily global, 250–1000m. Surface reflectance and land surface temperature.
- Landsat 8/9. 30m, 16-day revisit. The program archive reaches back to 1972 (Landsat 1).
- Copernicus DEM. 30m global; 10m European subset.
- Overture Maps. Built-up footprints, transportation networks, named places.
- Open weather. 15-minute reanalysis. Temperature, precipitation, wind, radiation.
Open to academic and government partners. avijeet@vortx.ai