Detecting dark matter subhalos with the Nancy Grace Roman Space Telescope

The dark matter subhalo mass function is a promising way of distinguishing between dark matter models. While cold dark matter predicts halos down to Earth-sized masses, other dark matter models typically predict a cut-off in the subhalo mass function. Thus a definitive detection or limits on the existence of subhalos at small masses can give us insight into the nature of dark matter. If these subhalos exist in the Milky Way, they would produce weak lensing signatures, such as modified apparent positions, on background stars. These signatures would generate correlations in the apparent velocities and accelerations of these stars, which could be observable given sufficient astrometric resolution and cadence. The Nancy Grace Roman Space Telescope's Exoplanet Microlensing Survey will be perfectly suited to measuring the acceleration signatures of these halos. Here we forward model these acceleration signatures and explore the Roman Space Telescope's future constraints on lens profiles and populations. We find that the Roman Space Telescope could place competitive bounds on point source, Gaussian, and Navarro-Frenk-White (NFW) profile lenses that are complementary to other proposed methods. In particular, it could place 95% upper limits on the NFW concentration, $c_{200}<10^{2.5}$. We discuss possible systematic effects that could hinder these efforts, but argue they should not prevent the Roman Space Telescope from placing strong limits. We also discuss further analysis methods for improving these constraints.


I. INTRODUCTION
Although we have known about dark matter (DM) for many decades [1][2][3], we have yet to find a particle responsible for its effects. The promise of a weakly interacting particle has so far eluded direct detection efforts [4][5][6][7]. Many other candidate particles have attracted considerable attention (e.g., axions and self-interacting DM); however we have not seen concrete evidence of their unique signatures.
While we have observed many DM signatures, one feature is both universal to all galaxies and discerning of different particle models: the subhalo mass function, which describes the distribution of sub-galactic DM halos within galaxies. From hierarchical structure formation, we expect smaller halos to exist within larger ones [8,9]. The exact properties of the DM (e.g., whether it is warm or cold, or whether it has any interactions with other particles) define the shape of the subhalo mass function and the profiles of these halos. The cold dark matter (CDM) model predicts halos down to Earth-sized masses [10,11]. However, many other models have subhalo mass function cutoffs at higher scales. Thus, if we are able to find evidence for subhalos at small masses, we can rule out certain properties of DM [see Ref. 12, for a review of these issues].
Recently, stellar streams have also emerged as a way of constraining dense halos [e.g. , 20]. For more diffuse halos, strong gravitational lensing has produced interesting constraints [e.g., [21][22][23][24]. Subhalos could also produce astrometric signatures on stars. Initial studies focused on microlensing signatures [25,26], which are not likely to be observable unless DM subhalos are especially compact. More recently, Ref. [27] explored the weak lensing 1 signatures that DM subhalos would induce on stars within the Milky Way. These vary depending on the types of halos (e.g., point source or diffuse). However, all are observed through changes in apparent stellar velocities or accelerations. Follow-up work calculated the power spectrum of these lensing signatures and forecasted interesting constraints from current and future missions [28]. More recent work has also considered the use of machine learning to look for these signatures [29,30].
The Nancy Grace Roman Space Telescope is NASA's next flagship mission after JWST, and it will have the necessary astrometric resolution to look for these weak lensing signatures [28,31,32]. In particular, the Exoplanet Microlensing (EML) survey would have the precision necessary to properly probe this signal. The EML survey should continuously monitor an approximately 2 deg 2 patch near the galactic center for six 72-day periods. The 15 minute cadence over these fields should produce over 40,000 exposures of each star in the region [33][34][35]. Combined with Roman's relative astrometric accuracy of 1.1 mas per exposure, this would create incredibly precise acceleration maps of the 10 8 stars that the EML survey will monitor.
In this paper, we forward model the constraints that the Nancy Grace Roman Space Telescope's EML Survey would place on DM subhalos in the Milky Way. In Section II we summarize the lensing signature we expect. In Section III we discuss our forward modeling approach. In Section IV we present our results, discuss possible systematics, and identify avenues for improved constraints.
The code used in this paper is publicly available at: https://github.com/kpardo/dmsl.

II. THEORY
In this section, we provide a summary of the lensing signatures expected by DM substructure. For a more detailed discussion of these signatures, see Ref. [27].
The apparent change in a star's position caused by one lens is: where c is the speed of light, b is the impact parameter on the sky, D l is the distance to the lens, D s is the distance to the star, and M l is the lensing mass within a cylinder of radius b and height D s . From now on, we take the distant-source limit: D s D l . Then, the apparent velocity change is [27,28]: where˙≡ d dt , v ≡˙ b, and ≡ d db . The apparent acceleration due to a DM lens is then: where: For the rest of this work, we set˙ v = 0. We also ignore any changes to the direction of b.
Note the very strong dependence on the impact parameter here, α ∝ 1/b 3 . This will be important to our statistical formalism.
Examples of the lensing signature for two different mass profiles (NFW and point-source) are given in Fig. 1. The differences in the signatures are due to the dependence on the mass profile derivatives in Equation 3. For an NFW profile, all of the mass-impact parameter ratios (M l /b 3 , M l /b 2 , and M l /b) are approximately the same order of magnitude, with M l /b having the opposite sign as the other terms (for b/r s > 1). The resulting simplifications give a dipole. For the point source case, M l = M l = 0. This allows the quadrupole moment to dominate. These behaviors are easiest to see when using a vector spherical harmonic decomposition [see Ref. 28, for some discussion of this topic].

III. METHODS
In general, we expect there to be some population of lenses that change the apparent accelerations of stars. Each star can be lensed by multiple lenses and the probability of being lensed increases with the number of lenses. Thus, the lens signal for any one star will rely on M l , v l , and b, as well as the total number of lenses, N l . Here we forward model the expected distribution of acceleration signals induced by DM weak lensing. We then perform a Markov Chain Monte Carlo (MCMC) over different lens parameters to find the constraints the Roman Space Telescope would achieve. In the following, we first give the details of our statistical model and then describe our sampling efforts.

A. Statistical Formalism
Upon a first attempt at this problem, we could try to fully forward model the system. This would involve placing the lenses at random with each sample of the MCMC. However, this approach quickly becomes intractable without considerable computational resources. It requires many samples to fully probe the impact parameter distribution at any fixed N l , nevermind when also sampling M l , which sets N l . In addition, this forward modeling would be very sensitive to the number density of simulated stars in the field of view (FOV). To achieve the correct result, a run with the same number as the true FOV would be required. Then the distances from these 10 8 stars to the N l lenses would need to be computed at each sample. In this paper, we instead opt for a semi-analytic approach, which we find reasonably accurate and more computationally efficient. Rather than sampling lens positions, we find the distribution of expected impact parameters given the number of lenses in the field. For each star in the field, we then sample from this distribution to find its impact parameter to the nearest lens. In the following, we explain this process in more detail. Equation 3 gives the apparent acceleration of 1 star from 1 lens. We first consider the likelihood of 1 star's given apparent acceleration due to a population of lenses. We will then generalize to N stars. Throughout, we assume that all lenses have the same mass and that their mass profiles are set solely by this mass.
First, let us consider the apparent acceleration of 1 star from a population of N l lenses with the same mass, M l . We assume that we are in the weak lensing regime. Then the total acceleration from these N l lenses is just the vector sum of the signal from each lens: where α l,i is given by Equation 3.
In general, the closest lens will dominate (since Equation 3 has such a strong dependence on b). Thus, we now only consider the lensing signature from the closest lens to each star. The impact parameter, b, is then taken to always be the minimum impact parameter. Then the above equation simplifies to α l ≈ α l (M l , v l , b min ).
We assume that the number of lenses is uniquely determined by the mass. Specifically, we assume that all of the dark matter is contained in halos of one mass, M l . Then, to conserve the local dark matter density, the number of lenses for a given lens mass is: where ρ DM is the local density of dark matter, A F OV is the area covered by the FOV, and d max is the maximum distance of the lenses. This assumes that the volume probed by the telescope is a square pyramid, which is a fine assumption for small FOVs (in other words, we take the flat sky approximation). We always assume at least 1 lens within the volume. Note that we do not realistically expect lenses within the FOV for the very largest masses here; however, quantifying this involves an assumed subhalo mass distribution. We leave this to future work.
The number of lenses, N l , then sets the impact parameter distribution as follows. First, we assume that the distribution of lenses is isotropic. This allows us to split b into an angular portion, b θ , which will just have a flat prior on [0, 2π), and a magnitude portion, b ≡ b . We now also split b into an angular distance on the sky, β, and a distance along the line of sight, d l . Specifically, we assume the small angle approximation and define b = β × d l . Given the FOV of the Roman Space Telescope (0.28 deg 2 ), the small angle approximation is valid. We assume that the distribution of β is independent of the distribution of d l . The distances to the lenses are given by simply assuming that they follow the dark matter density profile of the Milky Way. In particular, we assume the lenses' radial positions follow a Navarro, Frenk & White [NFW; 36] profile: where r is the distance from the MW center, r s = 18 kpc is the scale radius of the MW halo, and ρ s = 0.003 M /pc 3 is the density at that radius [i.e., we assume the same parameters as Ref. 37]. We also assume that all lenses are within 1 kpc of the Sun. Ref. [28] found that most of the power from DM lenses is constrained within this radius. This constraint also allows us to continue assuming the distant source limit. The β distribution can be found using some geometry and a little bit of numerical integration (see Appendix A for a full derivation). In essence, the calculation is one of finding the probability of the minimum distance between randomly placed points on a plane. It is straightforward to find the probability for the distance between two randomly placed points [38]. This can then be generalized to N points. Then the minimum can be found using logic. In general, it is a function with a long tail toward low β values that peaks near β ∼ (2 √ N l ) −1 l FOV , where l FOV is the side-length of the FOV. Fig. 2a shows the form of the function for various numbers of lenses.
In summary: where δ(x) is the Dirac Delta function, x is the ceiling function, and N l (M l ) is given by Equation 8. The velocity distribution of the lenses is taken to be independent of M l , N l , or b. It is set by the potential of the Milky Way, which we set from observations. As with the impact parameter, we assume that the lenses have isotropic velocities, which allows us to set a uniform prior on the angle of the lens velocity on the sky, p(v θ ) = U [0, 2π). For the magnitude, v l , we assume the same velocity distribution that Ref. [28] assumes: where v l = v l , µ v = 0 km/s and σ v = 220 km/s. In addition, we do not allow the lenses to have velocity greater than the escape velocity of the Milky Way, v < 550 km/s. Finally, the probability of a star having a particular apparent acceleration, given a lens mass M l , lens velocity v l , and impact parameter b, is then: where the factor in front, 1/4π 2 is given by the priors on b θ and v θ . Assuming the measurement errors on the acceleration are Gaussian, it is then straightforward to find the likelihood of a single stellar acceleration measurement given a DM lens with mass M l : where α is the observed acceleration, N (µ, σ 2 ) is a normal distribution with mean, µ, and variance, σ 2 , σ α is the error of the observation, and p( α l |M l ) is given by marginalizing Equation 12 over v l , β, N l , and d l .
For N stars, the full likelihood is then: It is straightforward to generalize this to lenses that have parameters beyond M l . Note that the above likelihood does not explicitly include the possibility of multiple stars being lensed by the same lens. However, some of this signal is preservedsee Section IV for further discussion of this point.
A summary of our likelihood terms is given in Table I.

B. Details of the Sampling
In this work, we focus on three types of mass profiles for our halos: point source, Gaussian, and NFW. In all cases, we assume a log-uniform prior on the mass within [10 0 , 10 8 ]M and sample in log 10 M l .
The point source case is most relevant for MACHOs; however, as we describe in Section IV, we expect microlensing to be a better model for these lenses. Thus, it mostly serves as a test case in this work. The point source case has M (b) = M l for all b and M l = M l = 0.
Unlike the other two cases, the Gaussian profile is not a realistic profile for any DM models. However, it is a commonly used profile due to its analytical nature. We compute the results here in order to compare to previous work. The profile is defined as: where M 0 is the total lens mass and R 0 is a characteristic radius. Here we set M 0 = M l and we sample in log 10 M l and log 10 R 0 . We use a log-uniform prior of [10 −4 , 10 4 ] pc for the scale radius. The NFW profile, given in Equation 9, is the expected profile for DM. We choose to model the profile using the mass, M l , and concentration parameter, c 200 . Thus we replace ρ s with: where ρ c is the critical density of the universe today, and x = 1 + c 200 . The scale radius, r s , is then given by: where M 200 is defined as the mass of a halo with density 200ρ c . Here we set M 200 = M l and sample in log 10 M l . We also sample in log 10 c 200 with a log-uniform prior of [10 0 , 10 8 ].
For the EML survey itself, we assume the following: the EML survey will measure the accelerations of 10 8 stars with sensitivity 0.1 µas/yr 2 [35]. We also assume a square FOV 2 with area equal to the true FOV area, A FOV = 0.28 deg 2 .  We use the emcee package to perform our MCMC. We only explicitly sample in the mass profile parameters described above. However, each sample of log 10 M l sets N l via Equation 8. This then sets the probability distribution for the impact parameter using Equation 10. We sample N times from this distribution. We then sample N times from the v l distribution. The apparent acceleration from lensing is calculated using these sampled parameters for each star via Equation 3. The likelihood is then calculated via Equation 14.

IV. RESULTS & DISCUSSION
Using the forward modeling procedure described above, we forecast the expected limits on DM substruc-ture that the Nancy Grace Roman Space Telescope's EML survey would place from lensing-induced accelerations on stars. Below we describe our results in detail and discuss possible limitations and future work. Our results are summarized in Table II.

A. Results For Different Lens Profiles
Here we discuss our results for 3 different types of lenses: point sources, Gaussian profiles, and NFW profiles. For each of these, we consider a case where these halos make up all of the DM and a case where the fraction of DM contained in these halos is treated as a free parameter.
As a test case, we first consider point source lenses. Note that the effects from these lenses is probably best described with microlensing. However, at large impact parameters they would produce the weak lensing signatures described in this paper. To account for this, we place limits on the maximum apparent acceleration induced by lensing. We limited the maximum acceleration to a signal-to-noise ratio (SNR) of 1 for any single star. In practice, any stars with acceleration signatures at high SNRs would be cut from the data sample anyway.
For both the case where we allow the fraction of the DM in these halos to vary and in the nominal case where all of the dark matter is in these subhalos, we find that the Roman Space Telescope will be able to place constraints at both high and low masses. These limits are comparable to previous MACHO constraints [18,19,39]. The previous MACHO constraints prohibit MACHOs from making up all of the dark matter at any mass. However, the constraints are weakest in the 10 − 100 M range. Those constraints allow for MACHOs to make up to 40% of the DM at 10 M . In the very conservative case we show here, the Roman Space Telescope will place similar constraints using these weak lensing signatures.
Note that the point-source case appears to feature a "detection", despite the use of fake data with no injected signal (i.e., Gaussian acceleration signals). This "detection" is just the weakest point between two limiting cases. At the low-mass end, there are large numbers of lenses, so the probability of a small impact parameter is very large. At the high-mass end, each lens has a large impact on the surrounding stars, since the acceleration signal scales with mass (see Equation 3). The peak in the posterior is the maximum mass at which we expect more than 1 lens within the survey volume. As we describe above, these signals are probably better described via microlensing. The efforts to find planets via microlensing with the Roman Space Telescope [34,40] will also be able to place constraints on MACHOs and other compact objects.
Next, we consider the Gaussian profile case, which is analogous to the "compact" lens case that Refs. [28,37] consider. Fig. 4 shows the results for the EML survey. In Fig. 5 we show the results if we assume the same survey parameters as Ref. [28]'s "WFIRST-Like" survey (similar to the EML, but with a wider field of view). We find somewhat tighter constraints than the results quoted in Ref. [28]. This is most likely due to the use of forward modeling rather than the power spectrum approach. The acceleration signature is not Gaussian, so the power spectrum approach misses part of the signal. Our result suggests that an alternative approach using higher order statistics, such as a bispectrum or trispectrum analysis, would be fruitful.
Finally, we consider the NFW profile case (see Fig. 6). The mass constraints for this case are the least constraining, as is expected given the diffuse nature of these halos. However, we find that the Roman Space Telescope would place interesting constraints on the concentration of these halos. Namely, it would be very sensitive to any halos with c 200 > the results when the fraction of DM in these lenses is also sampled. The orange lines give the results when we assume that all the DM is in these halos. The 2D plot shows the 68% and 95% contours.
FIG. 5: Forecasted Constraints on Gaussian Lenses by a survey like the Roman Space Telescope's EML Survey, but with a larger field of view (f sky = 0.05). The blue lines give the 68% and 95% contours. The dashed black line gives the 95% upper limit from Ref. [28].
for subhalos very close to their host galaxies, simulations have found that c 200 could extend upwards to ∼ 70 [41]. Thus, with only a modest improvement in sensitivity (either through analysis methods or survey strategy), the Roman Space Telescope's EML survey could be sensitive to CDM subhalos within the galaxy.

B. Discussion & Future Work
We now discuss the caveats of this study. First and foremost, the EML survey parameters have not been finalized and are subject to change. As mentioned above, we also make several simplifications (e.g., shape of the FOV) that could marginally effect the results. We discuss the most impactful assumptions below: neglecting binary star systematics, the use of mono-mass populations of lenses, and our modeling of multiple star lensing by single lenses.
One possible systematic is the acceleration from binary star systems. These systems are ubiquitous and could be a serious systematic since they would produce large acceleration signatures. For low-mass binary systems, the most likely semi-major axis is a = 5.3 AU [for a review of binary star systems, see Ref. 42]. This, with M = 0.1 M and a distance of 8 kpc from the observer gives an on-sky acceleration of α ∼ 35 µas/yr 2 , which is many orders of magnitude greater than the expected lensing signal from a population of lenses. However, careful modeling could be used to distinguish the signals. In most cases, the periodic motion of the stellar positions would allow us to discard these sources [see 34, for a similar discussion of how to distinguish binaries from planet signals].
We only consider mono-mass populations of lenses in our work. This allows us to consider the constraining power at different masses more easily, but it is not a realistic scenario. The expected subhalo mass function is largely uncertain, due to tidal effects, baryonic feedback, and DM physics. In addition, the radial dependence and expected profiles of the subhalos can also heavily depend on these factors and vary from the assumptions made in this paper [see, e.g., [43][44][45]. Future work could look into the constraints on the mass function, as well as consider other radial dependencies and mass profiles. Constraints on these properties would let us learn about both DM physics and baryonic feedback.
Finally, as mentioned in Section III, we do not explicitly consider the effects of a single lens affecting multiple stars. This would add extra correlations to the acceleration signal expected in the stars. In some sense, our p(β|N l ) likelihood does account for the particular distribution of impact parameters assuming a fixed number of lenses in the field. The magnitude of the signal should be preserved by our formalism, even if the angular distribution is not. Our analysis here can be taken as a conservative estimate of the true signal.
There are also several ways of improving these constraints. We discuss 2 avenues for further improvement FIG. 6: Forecasted Constraints on NFW lenses by the Roman Space Telescope. The blue lines/areas show the results when the fraction of DM in these lenses is also sampled. The orange lines give the results when we assume that all the DM is in these halos. The 2D plot shows the 68% and 95% contours.
here: 1) using time domain signatures; 2) using observations of other systems to improve constraints.
This paper solely considered the overall acceleration measurements of these stars at the end of the Roman Space Telescope's nominal mission. However, we could also consider the information contained in the time series measurements of these sources. For example, consider a lens 1 kpc from us moving at 300 km/s in front of a background of 10 8 stars distributed across the 2 deg 2 EML survey area. If this lens was directly in front of one star, it would take 60 years to be in the line of sight of a different star. Over 5 years, we still would expect to see a large difference in the lensing pattern from this lens, since the signal scales so strongly with impact parameter. A detailed exploration of this topic is left to future work.
One other way of improving constraints is to use complementary observations to the EML survey. A guest observer program for the Nancy Grace Roman Space Telescope that targets nearby galaxies could yield even better limits, provided the number of stars observed at high astrometric accuracy is sufficient. For example, an M31 survey combined with the Hubble Space Telescope PHAT survey [46] could be incredibly powerful. This would give a longer baseline to the observations and allow us to leverage the time domain signatures we discuss above. However, the EML survey's timing resolution will likely allow it to remain the best survey for observing these lenses.
In this paper, we considered the constraints the Nancy Grace Roman Space Telescope's Exoplanet Microlensing (EML) Survey would place on dark matter substructure through their lensing effects on the apparent accelerations of stars in the Galaxy. We forward modeled the effects of various lens profiles using a semi-analytic framework. While Roman will not be able to place strong constraints on the fraction of DM in these lenses, we found that Roman's EML survey will place competitive bounds on the masses and other profile parameters of point source, Gaussian, and NFW lenses. In particular, the NFW bounds will be complementary to those placed by satellite counts by probing lower masses than is possible with MW satellites [see, for example, 47]. Future work on the timing signatures of these lenses could lead to further improvement on these bounds.
We would like to find the distribution of impact parameters given a random set of DM lenses and stars. First, note that the number of stars does not matter for this distribution -the populations of each are totally separate, and we can consider each star independently. Essentially, each star is a random draw of the impact parameter distribution.
Throughout this appendix, we keep the lenses at a fixed distance from us and only consider their on-sky distribution. The distance and on-sky distributions should be totally independent and the distance can be sampled afterwards.
First consider a toy problem: a 1D line with length a where we place uniformly place our lenses. For now, we place a single lens. Then place a star on this line. It too will be randomly and uniformly placed. The probability of the squared distance between the star and the lens, x, is given by [38]: and 0 elsewhere. Consider N l lenses. Let the random variable X i for i ∈ [1, N l ] be a random variable describing the squared distance of lens i from our one star. As Eq. 3 shows, the acceleration signal depends very strongly on the impact parameter (α ∝ 1/b 3 ). The signal is then mostly determined by the lens closest to the star. Thus, we want to find the probability that the closest lens has a squared distance, x. In other words, we want: p(min(X 1 , X 2 , ..., X n ) = x).
Let us start by working with the cumulative distribution function (CDF), P (min(X 1 , X 2 , ..., X n ) ≤ x). Note that this is equivalent to the probability of any one lens having squared distance less than x. The probability of this is then the converse of the probability that all of the lenses have squared distances greater than x. The probability that any one lens has a squared distance greater than x is: P (any(X 1 , X 2 , ..., X n ) ≥ x) = 1−F (x). For all n lenses this is: P (all(X 1 , X 2 , ..., X n ) ≥ x) = (1−F (x)) n . Finally, the CDF we are looking for: To obtain the probability distribution function (PDF) in this case, we simply take the derivative of Eqn. A2. This gives: p(min(X 1 , X 2 , ..., X n ) = x) = an √ x − ax a √ x − 1 2n (A3) Now, consider the 2D case. To be totally general, we can consider a rectangle with sides a,b, and a ≤ b. For just 1 star and 1 lens, the probability of a given squared distance is [38]: and 0 elsewhere. The generalization to the minimum case is the same as in the 1D case. We numerically integrate Eqn. A4 to obtain it's CDF, G(x). We then apply Eqn. A2 and take the derivative to obtain the PDF.