Querying the EFD
The Engineering and Facilities Database (EFD) is a time-series store fed by the Rubin
Observatory Control System via SAL. It is backed by InfluxDB and can be queried through
InfluxQueryClient.
See also lsst-efd-client for a full-featured asynchronous client.
rubin_nights provides a similar but slightly simpler client with both
synchronous and asynchronous interfaces, intended to mimic the basic
lsst_efd_client.EfdClient API. Additional documentation on querying the EFD,
including InfluxQL queries, is available in the lsst-efd-client documentation.
The full EFD schema is documented at ts-xml.lsst.io.
Each SAL component (CSC) exposes commands, events, and telemetry as separate topics.
The lsst.sal.<CSC>.<type>_<name> naming convention is used throughout.
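To illustrate the naming convention, the sketch below splits a topic name into its CSC, topic type, and name. `parse_topic` is a hypothetical helper, not part of rubin_nights. It assumes that commands carry a `command_` prefix, events a `logevent_` prefix, and that any other topic is telemetry, which is consistent with the topic names used on this page.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    csc: str
    kind: str  # "command", "event", or "telemetry"
    name: str


def parse_topic(topic: str) -> TopicName:
    """Split an EFD topic name of the form lsst.sal.<CSC>.<type>_<name>.

    Hypothetical helper: assumes `command_` marks commands, `logevent_`
    marks events, and any other topic is telemetry.
    """
    prefix = "lsst.sal."
    if not topic.startswith(prefix):
        raise ValueError(f"Not a SAL topic: {topic!r}")
    csc, _, rest = topic[len(prefix):].partition(".")
    if not csc or not rest:
        raise ValueError(f"Missing CSC or topic name: {topic!r}")
    for marker, kind in (("command_", "command"), ("logevent_", "event")):
        if rest.startswith(marker):
            return TopicName(csc, kind, rest[len(marker):])
    return TopicName(csc, "telemetry", rest)


print(parse_topic("lsst.sal.Scheduler.logevent_target"))
print(parse_topic("lsst.sal.ESS.temperature"))
```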
The time index of every EFD topic is in UTC. Timestamp fields carried within
individual topics are usually in UTC as well, with some notable exceptions in TAI.
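A TAI timestamp has to be shifted before it can be compared with the UTC index. The sketch below does this with the standard library only. It assumes the TAI-UTC offset of 37 s that has applied since 2017-01-01 (the offset changes whenever a leap second is introduced), and that the TAI value counts seconds in such a way that it runs exactly 37 s ahead of the Unix UTC timestamp. `tai_unix_to_utc` is a hypothetical helper; astropy.time.Time handles the general case.

```python
from datetime import datetime, timedelta, timezone

# TAI - UTC offset in seconds; valid from 2017-01-01 until the next leap second.
TAI_MINUS_UTC_S = 37


def tai_unix_to_utc(tai_seconds: float) -> datetime:
    """Convert a TAI timestamp (seconds, 37 s ahead of Unix UTC) to UTC."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=tai_seconds - TAI_MINUS_UTC_S)


# A TAI stamp 37 s ahead of 2025-06-20T12:00:00 UTC maps back onto that instant.
utc_unix = datetime(2025, 6, 20, 12, tzinfo=timezone.utc).timestamp()
print(tai_unix_to_utc(utc_unix + TAI_MINUS_UTC_S).isoformat())
```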
Querying a time range
select_time_series() returns all records for a
topic between t_start and t_end as a DataFrame indexed by UTC timestamp.
Pass "*" to retrieve all fields, or a list to select specific ones.
from astropy.time import Time, TimeDelta
# `endpoints` is assumed to be a mapping of already-configured clients,
# with the EFD client stored under the "efd" key
# Specify the start and end of a given day_obs
day_obs = "2025-06-20"
t_start = Time(f"{day_obs}T12:00:00", format="isot", scale="utc")
t_end = t_start + TimeDelta(1, format="jd")
# Retrieve all Scheduler targets issued during the night
targets = endpoints["efd"].select_time_series(
    "lsst.sal.Scheduler.logevent_target",
    "*",
    t_start,
    t_end,
    index=1,  # salIndex of the MTScheduler
)
# Retrieve specific fields only
truss_temp = endpoints["efd"].select_time_series(
    "lsst.sal.ESS.temperature",
    ["temperatureItem6", "temperatureItem7"],
    t_start,
    t_end,
    index=122,
)
The index parameter corresponds to the SAL index (salIndex) of the CSC instance.
For topics from CSCs that run as a single instance it can be omitted.
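The time window in the example above treats a day_obs as running from 12:00 UTC to 12:00 UTC on the following day. The sketch below makes that explicit using only the standard library; `day_obs_window` is a hypothetical helper, not part of rubin_nights.

```python
from datetime import datetime, timedelta, timezone


def day_obs_window(day_obs: str) -> tuple[datetime, datetime]:
    """Return the (start, end) UTC datetimes for a day_obs given as YYYY-MM-DD."""
    day = datetime.strptime(day_obs, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    start = day + timedelta(hours=12)
    return start, start + timedelta(days=1)


start, end = day_obs_window("2025-06-20")
print(start.isoformat(), end.isoformat())
```

The returned datetimes would still need to be converted to astropy Time objects before being used as t_start and t_end in the calls shown above.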
Retrieving the most recent records before a time
select_top_n() fetches the n most recent
records that precede time_cut. This is useful for getting the last known state
before the start of a query window.
last_targets = endpoints["efd"].select_top_n(
    "lsst.sal.Scheduler.logevent_target",
    "*",
    num=3,
    time_cut=t_start,
)
Running raw InfluxQL queries
The query method accepts a raw InfluxQL query string for cases where the helpers
above are not sufficient:
# Mean of temperatureItem6 in 5-second bins over the query window
query = (
    "SELECT mean(temperatureItem6) AS tma_truss_plus_xy "
    'FROM "efd"."autogen"."lsst.sal.ESS.temperature" '
    "WHERE salIndex = 122 "
    f"AND time >= '{t_start.utc.isot}Z' "
    f"AND time <= '{t_end.utc.isot}Z' "
    "GROUP BY time(5s) FILL(null)"
)
result = endpoints["efd"].query(query)
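When several fields or time ranges need the same kind of query, the string can be assembled by a small function. The sketch below is one way to do that; `build_mean_query` and its parameters are hypothetical and simply mirror the structure of the query above. Inputs are interpolated directly into the string, so they should come from trusted code only.

```python
def build_mean_query(topic, field, alias, sal_index, start_iso, end_iso, interval="5s"):
    """Build an InfluxQL query that averages one field in fixed time bins.

    Hypothetical helper; `start_iso` and `end_iso` are UTC ISO strings
    without a trailing "Z".
    """
    return (
        f"SELECT mean({field}) AS {alias} "
        f'FROM "efd"."autogen"."{topic}" '
        f"WHERE salIndex = {int(sal_index)} "
        f"AND time >= '{start_iso}Z' "
        f"AND time <= '{end_iso}Z' "
        f"GROUP BY time({interval}) FILL(null)"
    )


query = build_mean_query(
    "lsst.sal.ESS.temperature", "temperatureItem6", "tma_truss_plus_xy",
    122, "2025-06-20T12:00:00.000", "2025-06-21T12:00:00.000",
)
print(query)
```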
Listing available topics
topics = endpoints["efd"].get_topics()
Accessing LFA data at USDF
At the summit, the “largeFileAvailable” topics record filenames in the EFD as
URIs that use summit-local bucket paths.
connections.usdf_lfa() converts such a URI to one that is accessible from the USDF:
from rubin_nights.connections import usdf_lfa
summit_uri = "s3://rubinobs-lfa-cp/Scheduler/..."
usdf_uri = usdf_lfa(summit_uri)
Joining EFD time series with per-visit data
A common pattern is to interpolate a continuous EFD time series onto visit timestamps. The example below uses a spline to assign truss temperature values to each visit:
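import numpy as np
from astropy.time import Time
from scipy.interpolate import UnivariateSpline
# Convert the UTC index to TAI MJD (the visit times are assumed to be TAI MJD)
truss_times_mjd = Time(truss_temp.index.values, scale="utc").tai.mjd
# truss_temp holds the raw field names; temperatureItem6 is the field that the
# raw query above aliases to tma_truss_plus_xy
spline = UnivariateSpline(truss_times_mjd, truss_temp["temperatureItem6"].values, s=0)
visits["tma_truss_plus_xy"] = spline(visits["exp_midpt_mjd"].values)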
Alternatively, take the mean of EFD readings that fall within each exposure window:
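truss_temp["mjd"] = truss_times_mjd
def mean_during_visit(row, ts):
    # Select the EFD readings taken between the start and end of the visit
    window = ts[(ts["mjd"] >= row["obs_start_mjd"]) & (ts["mjd"] <= row["obs_end_mjd"])]
    # temperatureItem6 is the field aliased to tma_truss_plus_xy in the raw query
    return window["temperatureItem6"].mean()
visits["tma_truss_plus_xy"] = visits.apply(mean_during_visit, args=(truss_temp,), axis=1)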
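A row-wise apply can become slow when there are many visits. As a possible alternative, the sketch below computes the same per-window means with NumPy only, using a binary search for the window edges and prefix sums for the totals. `window_means` is a hypothetical helper. It assumes the EFD times are sorted in ascending order and that the values contain no NaN entries, since a single NaN would propagate through the prefix sums.

```python
import numpy as np


def window_means(times, values, starts, ends):
    """Mean of `values` whose `times` fall in each closed [start, end] window.

    `times` must be sorted ascending; windows with no samples yield NaN.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Prefix sums let each window total be a difference of two entries.
    csum = np.concatenate(([0.0], np.cumsum(values)))
    lo = np.searchsorted(times, starts, side="left")   # first index with time >= start
    hi = np.searchsorted(times, ends, side="right")    # first index with time > end
    counts = hi - lo
    sums = csum[hi] - csum[lo]
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)


# Synthetic data: five readings and three windows, the last one empty.
times = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
values = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
means = window_means(times, values, starts=[0.5, 2.0, 5.0], ends=[2.5, 4.0, 6.0])
print(means)
```

In practice the inputs would be the EFD times and readings together with the visit start and end times, all expressed on the same time scale.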