# An AC-Coupled Wideband Neural Recording Front-End With Sub-1mm<sup>2</sup>×fJ/conv-step Efficiency and 0.97 NEF

Arda Uran, Student Member, IEEE, Yusuf Leblebici, Fellow, IEEE, Azita Emami, Senior Member, IEEE, and Volkan Cevher, Senior Member, IEEE

Abstract: This paper presents an energy-and-area-efficient AC-coupled front-end for multichannel recording of wideband neural signals. The proposed unit conditions local field and action potentials using an inverter-based capacitively-coupled low-noise amplifier, followed by a per-channel 10-bit asynchronous SAR ADC. The adaptation of unit-length capacitors minimizes the ADC area and relaxes the amplifier gain so that small coupling capacitors can be integrated. The prototype in 65nm CMOS achieves $4\times$ smaller area and $3\times$ higher energy-area efficiency compared to the state of the art with a $164~\mu m \times 40~\mu m$ footprint and a 0.78 mm²×fJ/conv-step energy-area figure of merit. The measured $0.65~\mu W$ power consumption and $3.1~\mu V_{rms}$ input-referred noise within 1 Hz-10 kHz bandwidth correspond to a noise efficiency factor of 0.97.

Index Terms: AC coupling, front-end, inverter-based amplifier, monotonic switching, neural recording, successive approximation register analog-to-digital-converter, unit length capacitor.

## I. INTRODUCTION

Recording and decoding high-frequency neural features through intracortical brain-computer interfaces has allowed accurate control of complex actuators [1]. Moving from laboratory demonstrations to widespread use of such systems requires combining signal acquisition and processing in an implantable system on chip [2], [3], under stringent energy and size constraints. Therefore, accommodating the maximum number of electrodes requires the lowest energy-area cost per recording channel.
Neural signals bear information within frequency bands called local field potentials (LFP) up to $\sim\!300$ Hz, and action potentials (AP) between 300 Hz and 10 kHz, with amplitudes typically ranging from a few $\mu$V to a few mV [4]. A generic neural recording front-end boosts, filters, and digitizes these signals using a low-noise amplifier (LNA) followed by an analog-to-digital converter (ADC) as depicted in Fig. 1(a). Traditionally, the electrodes are AC-coupled to the front-end to reject large and variable DC offsets building on the tissue-electrode interface [5]. The coupling capacitor also presents a high input impedance to the electrode, and serves as an isolation layer against static device currents and short circuits [6]. Nevertheless, the capacitor area becomes a bottleneck when a high gain and a low high-pass pole are required [7], limiting the scalability of this approach.

Several DC-coupled techniques have been proposed to eliminate bulky coupling capacitors at the expense of other circuit qualities. Direct digitization (Fig. 1(b)) by removing the LNA results in excessive input-referred noise [8]. Canceling the offset through a mixed-signal feedback loop (Fig. 1(c)) [7] has a digital power and area overhead that scales up with the desired input range. Delta modulating the input with switched capacitors (Fig. 1(d)) can achieve rail-to-rail cancellation at the cost of reduced input impedance and increased noise at high sampling rates [9], which is also the case when the electrodes are multiplexed into a shared front-end (Fig. 1(e)) [10]. Moreover, the injection of switching currents into the neural tissue raises safety concerns that have not yet been addressed.

This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement n° 725594 - time-data), and from Hasler Foundation (project number 16066).

- A. Uran and V. Cevher are with École polytechnique fédérale de Lausanne, 1015 Lausanne, Switzerland. e-mail: arda.uran@epfl.ch.
- Y. Leblebici is with École polytechnique fédérale de Lausanne, 1015 Lausanne, Switzerland, and Sabanci University, Istanbul, Turkey.
- A. Emami is with California Institute of Technology, 91125 Pasadena CA.

Fig. 1. (a) AC-coupled front-end architecture and DC-coupled area reduction techniques using (b) direct digitization, (c) mixed-signal offset cancellation, (d) switched-capacitor delta modulation, and (e) electrode multiplexing.

Due to these fundamental limitations of DC coupling, AC-coupled front-ends are still favorable for wideband recording thanks to low noise, passive offset rejection, and high input impedance. However, the area efficiency of this approach has to be improved to allocate more resources to other implant features.

In this work, we present an AC-coupled neural recording front-end architecture which permits high integration density without compromising power efficiency or sampling frequency. Our adaptation of unit-length capacitor (ULC)-based successive approximation register (SAR) ADC [11] minimizes the ADC area and relaxes the LNA gain requirement, which in turn makes room for small coupling capacitors. The inverter-based LNA achieves high noise efficiency thanks to current reuse. The constant common-mode monotonic ADC switching scheme further improves the energy efficiency. Our prototype in 65nm CMOS process achieves the smallest AC-coupled footprint reported in the literature with 6560 $\mu m^2$, which is also smaller than or comparable to the recent DC-coupled implementations. The overall energy-area figure of merit (E-A FoM) measuring 0.78 mm²×fJ/conv-step, as well as the noise efficiency factor (NEF) measuring 0.97 within 1 Hz-10 kHz bandwidth, are lower than the state of the art.

## II. DESIGN DETAILS

The proposed front-end follows the conventional approach depicted in Fig. 1(a). A capacitively-coupled LNA (CC-LNA) boosts and filters neural signals within LFP and AP bands, followed by a 10-bit 20 kS/s ULC-based SAR ADC. The following subsections provide detailed descriptions of each part.

#### A. Capacitively-Coupled Low-Noise Amplifier

Fig. 2(a) shows the CC-LNA schematic. The nominal gain of the amplifier is set by the capacitor ratio $C_{AC}/C_{FB}$ as 40 dB, based on the ADC input range and the expected recording site noise as will be discussed in Section II-B. We choose $C_{AC}$ as 2 pF, and $C_{FB}$ can be configured from 15 fF to 65 fF, giving a programmable gain between 30 and 40 dB. Diode-connected PMOS feedback resistors (simulated nominal $R_{FB} = 50~T\Omega$) ensure that the high-pass cut-off at $1/(2\pi R_{FB}C_{FB})$ stays below 1 Hz for all configurations of $C_{FB}$ across corners. The low-pass cut-off can be adjusted by changing the bias current $(I_B)$ of the operational transconductance amplifier (OTA).

Fig. 2. (a) Fully-differential capacitively-coupled LNA. (b) Transistor-level schematic of the two-stage inverter-based OTA.

Fig. 3. (a) Detailed schematic of the ULC-SAR ADC showing the bootstrapped switch and constant common-mode ULCDAC circuits explicitly. (b) ULC dimensions following the naming convention in [11]. (c) Physical implementation of the differential constant common-mode ULCDAC.

The fully-differential OTA employs the two-stage inverter-based topology shown in Fig. 2(b). This topology has high noise efficiency thanks to current being reused by the complementary differential pairs. All pairs are constructed with thick-oxide transistors and operate in the weak inversion regime, hence the open-loop gain depends mostly on transistor lengths. The first stage provides 38 dB gain; the second stage adds 25 dB and drives the ADC input.
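The gain and high-pass relations quoted for the CC-LNA can be checked numerically. The following is a minimal sketch, assuming the ideal closed-loop gain $C_{AC}/C_{FB}$ and the single-pole corner $1/(2\pi R_{FB}C_{FB})$ stated in the text; parasitic capacitance and finite OTA gain are ignored, and the function names are illustrative rather than taken from the paper.

```python
import math

C_AC = 2e-12   # coupling capacitor from the text [F]
R_FB = 50e12   # simulated nominal feedback resistance from the text [ohm]


def ideal_gain_db(c_ac, c_fb):
    """Ideal closed-loop gain C_AC/C_FB in dB (parasitics and finite OTA gain ignored)."""
    return 20.0 * math.log10(c_ac / c_fb)


def highpass_cutoff_hz(r_fb, c_fb):
    """High-pass corner frequency 1/(2*pi*R_FB*C_FB)."""
    return 1.0 / (2.0 * math.pi * r_fb * c_fb)


# Evaluate both ends of the programmable C_FB range given in the text.
for c_fb in (15e-15, 65e-15):
    gain = ideal_gain_db(C_AC, c_fb)
    f_hp = highpass_cutoff_hz(R_FB, c_fb)
    print(f"C_FB = {c_fb * 1e15:.0f} fF: gain = {gain:.1f} dB, f_HP = {f_hp:.3f} Hz")
```

Under these idealizations the ratio alone gives about 42.5 dB at 15 fF and about 29.8 dB at 65 fF, slightly different from the nominal 30 to 40 dB range, which is presumably where the neglected non-idealities enter. The corner frequency stays below 1 Hz at both ends of the range, and the 65 fF setting gives roughly 0.05 Hz.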
The output common-mode is sensed via diode-connected transistors and set nominally to $V_{\rm DD}/2$ by a simple common-mode feedback (CMFB) amplifier. Compensation capacitors ($C_{\rm C1} = 240$ fF, $C_{\rm C2} = 35$ fF) and the ratio of tail currents ensure stability in all corners.

#### B. ULC-based Asynchronous SAR ADC

The amplified signals are digitized by the 10-bit 20 kS/s asynchronous SAR ADC shown in Fig. 3(a). The key area advantage of the ADC is the unit-length capacitor-based digital-to-analog converter (ULCDAC), first proposed in [11]. A ULC relies on the difference of two metal capacitors ($C - C' = 2C_\Delta$) with unit length difference $\Delta$, such that a binary weight can be realized by adjusting $C_\Delta$ rather than multiplying the capacitor structure. As a result, the total array requires $N$ ULCs instead of $2^N$ unit capacitors. Moreover, the ULCDAC can be constructed using few metal layers and placed above the other circuits. Fig. 3(b) illustrates the ULC used in this implementation with its construction parameters following the conventions in [11]. The plates are constructed in M6 and M7 layers in parallel, which results in a $C_\Delta$ of 0.1 fF.

Our adaptation of ULCDAC shown in Fig. 3(c) leverages the constant common-mode monotonic switching scheme [12] to reduce the switching activity while maintaining a constant common-mode voltage for the comparator input. This requires a second ULCDAC switching in the opposite direction, but brings no area overhead as the single ULCDAC size is halved thanks to monotonic switching. As a result, the overall differential DAC consists of 4 single-ended ULCDACs, each of which is partitioned into 4-bit binary LSB and 4-bit unary MSB arrays to keep mismatch under control as explained in [11]. The remaining two bits are implicit as the first comparison is performed on the sampled input, and the last two bits are implemented single-ended.
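The element-count argument behind the ULCDAC area saving can be made concrete with a short sketch. It assumes a conventional binary-weighted array built from $2^N$ identical units, one ULC per bit for a purely binary ULC array as stated above, and standard thermometer coding (that is, $2^u - 1$ equal elements) for a $u$-bit unary segment. The thermometer-coding count is an assumption of this sketch, not a detail confirmed by the text, and the function names are hypothetical.

```python
def conventional_unit_count(n_bits):
    """Unit capacitors in a binary-weighted array of identical units: 2**N."""
    return 2 ** n_bits


def ulc_count(n_bits):
    """Unit-length capacitors in a purely binary ULC array: one per bit."""
    return n_bits


def segmented_count(binary_bits, unary_bits):
    """Elements in a segmented array: one per binary LSB bit plus
    2**u - 1 equal elements for a u-bit thermometer-coded MSB segment
    (thermometer coding is an assumption of this sketch)."""
    return binary_bits + (2 ** unary_bits - 1)


n = 10
print(f"{n}-bit conventional array: {conventional_unit_count(n)} units")
print(f"{n}-bit binary ULC array:   {ulc_count(n)} ULCs")
print(f"4-bit binary + 4-bit unary: {segmented_count(4, 4)} elements")
```

For a 10-bit array this gives 1024 units against 10 ULCs, and the 4-bit binary plus 4-bit unary partition counts 19 elements per single-ended array under the thermometer-coding assumption.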
Although the benefit of energy-efficient switching on the small capacitance of ULCDAC is negligible [11], we observe a 10% reduction in total simulated ADC power thanks to less activity in the asynchronous logic block. Another feature of ULCDAC useful for our application is that, due to the total capacitance $(\Sigma(C+C'))$ being higher than the total effective differential capacitance $(\Sigma(C-C'))$ [11], the input range of the ADC decreases from 2 V to 0.65 V peak-to-peak differential without needing a separate supply or dedicated reference generators. The resulting 0.6 mV quantization step allows us to choose 40 dB LNA gain to capture signal content above the 5 $\mu V_{rms}$ expected measurement noise floor [6].

Fig. 4. Asynchronous logic for monotonic switching algorithm.

Fig. 5. The annotated micrograph of the $164~\mu m \times 40~\mu m$ prototype.

The switches, comparator, and DAC drivers are placed below the ULCDAC with shielding in M4 and M5 layers. To reduce sampling nonlinearity, bootstrapped track-and-hold switches are used, with a PMOS precharge capacitor to save area. The comparator employs a dynamic double-tail architecture [11].

Fig. 4 shows the asynchronous logic for the proposed ADC. Upon a rising external clock edge, the input is sampled ($\Phi_s$), and the chain of 10 stages is initiated. Each stage clocks the comparator ($\Phi_c$), then monotonically switches the ULCDAC depending on the decision. The conversion takes about 100 ns, and the ADC remains idle until the next clock edge, which yields a low average power consumption when the sampling rate is on the order of kS/s. The circuit is a hybrid implementation of dynamic and static CMOS logic to achieve low power and robustness.

## III. MEASUREMENT RESULTS

The annotated micrograph of the fabricated prototype in 65nm 6X1Z1U LP CMOS process is given in Fig. 5.
The total channel area is $164~\mu m \times 40~\mu m$ (6560 $\mu m^2$) and the ULC-based SAR ADC takes $45~\mu m \times 35~\mu m$ (1575 $\mu m^2$).

Fig. 6(a) and Fig. 6(b) verify the desired amplifier response for different gain and bandwidth settings, respectively. The gain can be modified between 30.8 dB and 40.1 dB by trimming the feedback capacitor $C_{FB}$. The high-pass cut-off frequency is around 0.05 Hz. The bandwidth can be adjusted between 500 Hz and 10 kHz by changing the bias current, which corresponds to 30 nA and 600 nA total supply current from a 1 V supply, respectively. The total harmonic distortion (THD) is 1.1% when an 8 mV peak-to-peak differential input is applied with maximum gain, and the common mode rejection ratio (CMRR) is 56 dB. As shown in Fig. 7, the input-referred noise (IRN) over the maximum bandwidth is 3.1 $\mu V_{rms}$.

Fig. 6. Measured LNA Bode plots for different (a) gain and (b) bandwidth configurations.

Fig. 7. Measured input-referred noise spectrum of the front-end.

Fig. 8. Measured output frequency spectra for (a) the ADC alone and (b) the overall channel.

The ADC consumes 47 nW when operating at 20 kS/s, and the consumption scales linearly with the sampling rate. The input range of the ULC array was measured to be 0.69 V peak-to-peak differential. The standalone ADC is able to achieve 75 dB spurious-free dynamic range (SFDR) and 9.2 effective number of bits (ENOB) over the entire bandwidth up to 2.5 MS/s. The output spectrum of the complete channel in comparison with the standalone ADC response is given in Fig. 8. The channel performance reduces to 8.1 ENOB and 68 dB SFDR when the LNA is connected. This is due to the LNA raising the overall noise floor and to added nonlinearity from the LNA output settling slightly more slowly than expected. The histogram in the inset of Fig. 8 depicts the inter-die variation of ENOB across 15 measured samples, which reflects the ULC mismatch.
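The figures of merit reported for this prototype can be reproduced approximately from the measured quantities. The sketch below assumes the standard definitions: the Walden figure of merit $P/(2^{\mathrm{ENOB}} f_s)$, the noise efficiency factor $V_{rms,in}\sqrt{2I_{tot}/(\pi U_T \cdot 4kT \cdot BW)}$, and the power efficiency factor $\mathrm{NEF}^2 \cdot V_{DD}$. It also assumes a temperature of 300 K, a 10 kHz noise bandwidth, and a total channel current equal to the 0.65 $\mu W$ channel power divided by the 1 V supply. These choices and the function names are assumptions of the sketch, not details taken from the paper.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q_E = 1.602176634e-19   # elementary charge [C]


def walden_fom(power_w, enob, fs_hz):
    """Walden figure of merit in J per conversion step: P / (2**ENOB * fs)."""
    return power_w / (2.0 ** enob * fs_hz)


def nef(v_irn_rms, i_total, bandwidth_hz, temp_k=300.0):
    """Noise efficiency factor; temp_k = 300 K is an assumed temperature."""
    u_t = K_B * temp_k / Q_E  # thermal voltage [V]
    denom = math.pi * u_t * 4.0 * K_B * temp_k * bandwidth_hz
    return v_irn_rms * math.sqrt(2.0 * i_total / denom)


def pef(nef_value, vdd):
    """Power efficiency factor: NEF**2 * VDD."""
    return nef_value ** 2 * vdd


# Measured values quoted in the text.
adc_fom = walden_fom(47e-9, 9.2, 20e3)        # standalone ADC
chan_fom = walden_fom(0.65e-6, 8.1, 20e3)     # complete channel
ea_fom = chan_fom * 1e15 * 0.00656            # [fJ/conv-step] x [mm^2]

# Assumed: total channel current = 0.65 uW / 1 V, bandwidth = 10 kHz.
channel_nef = nef(3.1e-6, 0.65e-6 / 1.0, 10e3)
channel_pef = pef(channel_nef, 1.0)

print(f"ADC FoM_W     = {adc_fom * 1e15:.2f} fJ/conv-step")
print(f"Channel FoM_W = {chan_fom * 1e15:.1f} fJ/conv-step")
print(f"E-A FoM       = {ea_fom:.2f} mm^2 x fJ/conv-step")
print(f"NEF = {channel_nef:.2f}, PEF = {channel_pef:.2f}")
```

With these inputs the sketch gives roughly 4.0 fJ/conv-step for the ADC, about 118 fJ/conv-step for the channel, and about 0.78 mm²×fJ/conv-step for the energy-area product. The NEF and PEF come out slightly below 1, close to but not exactly the reported 0.97 and 0.94, since the result depends on the assumed temperature and on which supply current is counted.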
The channel performance and the inter-die variation could be improved with a small resource penalty by increasing the LNA drive strength and, if necessary, the ULC $\Delta$, but the measured worst-case performance still supports the 50 dB neural dynamic range [7].

Table I summarizes the performance of the proposed neural recording front-end and compares it with the previous works [4], [7]–[10], [13]–[15]. The silicon footprint is four times smaller than the smallest AC-coupled front-end reported, and it is comparable to the recent DC-coupled ones. The power consumption is the lowest among the wideband front-ends, which results in channel noise and power efficiency factors (NEF/PEF) [7] slightly below 1. The Walden figures of merit (FoM<sub>W</sub>) for the ADC and the channel align with the other energy-efficient architectures. The combined energy and area efficiency improves the E-A FoM (Area×FoM<sub>W</sub>) [4] by three times over the state of the art. In other words, the energy-area cost per sample (power×area/sampling rate) is 12 times lower than that of other similar resolution front-ends as visualized in Fig. 9.

TABLE I. SYSTEM SUMMARY AND COMPARISON WITH STATE-OF-THE-ART RECORDING ICS

| | JSSC'12 [7] | JSSC'15 [13] | JSSC'17 [9] | JSSC'18 [4] | SSCL'18 [14] | JSSC'18 [8] | TBCAS'19 [15] | TBCAS'20 [10] | This work |
|---|---|---|---|---|---|---|---|---|---|
| Technology | 65nm | 65nm | 130nm | 180nm | 65nm | 180nm | 180nm | 65nm | 65nm |
| V<sub>DD</sub> [V] | 0.5 | 1 | 1.2 | 0.5 / 1 | 0.6 | 1.8 | 0.5 | 2.5 / 0.5 | 1 |
| Coupling | DC | AC | DC | AC | DC | DC | AC | DC | AC |
| Multiplexing | No | Yes | No | No | No | No | No | Yes | No |
| Bandwidth [Hz] | 10-10k | 10-8k | 0.01-500 | 0.4-10.9k | 0.1-500 | 0-10k | 1-6.8k | 1-1k | 0.05-10k |
| Sampling rate [kS/s] | 20 | 20 | 1<sup>1</sup> | 25 | 1 | 20 | 31.25 | 2 | 20 |
| IRN [μV<sub>rms</sub>] | 4.9 | 7.5 | 1.13 | 3.32 | 2.2 | 12.07<sup>1</sup> | 5.4 | 1.66 | 3.1 |
| NEF/PEF | 5.99/17.96 | 4.45/12.9 | 2.86/9.82<sup>1</sup> | 3.02/4.56 | 8.7/45.4 | 29.1<sup>1</sup>/1529<sup>1</sup> | 2.99/4.46 | 2.21/12.21<sup>1</sup> | 0.97/0.94 |
| ADC topology | VCO | SAR | Δ-ΔΣ | Δ-ΔΣ | VCO ΔΣ | OTA-C ΔΣ | Δ-SAR | Δ-SAR | SAR |
| Area/Channel [mm<sup>2</sup>] | 0.013 | 0.0258 | 0.013<sup>2</sup> | 0.058<sup>3</sup> | 0.012 | 0.0049<sup>2</sup> | 0.16<sup>3</sup> | 0.0023<sup>3</sup> | 0.00656 |
| Power/Channel [µW] | 5.04 | 1.84 | 0.63<sup>2</sup> | 3.05<sup>3</sup> | 3.2<sup>2</sup> | 39.14<sup>2</sup> | 0.88<sup>3</sup> | 2.98<sup>3</sup> | 0.65 |
| Resolution [ENOB] | 7.2<sup>4</sup> | 8.2<sup>4</sup> | 11.7<sup>1</sup> @130Hz | 10.3 @1kHz | 8.18<sup>1</sup> @40Hz | 8.2 @1kHz | 7.7<sup>4</sup> | 8<sup>5</sup> | 8.1 |
| ADC FoM<sub>W</sub> [fJ/c-s] | 84 | 4.25<sup>1</sup> | n/a | 35.2 | n/a | n/a | 19.6 | 23.32<sup>1</sup> | 4.0 |
| Channel FoM<sub>W</sub> [fJ/c-s] | 1713.9<sup>1</sup> | 312.86 | 189.36<sup>1</sup> | 108.83 | 11034<sup>1</sup> | 7180 | 135.43 | 5820<sup>1</sup> | 118.5 |
| E-A FoM [fJ mm<sup>2</sup>/c-s] | 22.28<sup>1</sup> | 8.07<sup>1</sup> | 2.46<sup>1</sup> | 6.34 | 110.34<sup>1</sup> | 35.182<sup>2</sup> | 21.67<sup>1</sup> | 13.39<sup>1</sup> | 0.78 |

<sup>1</sup> Estimated from given data. <sup>2</sup> On-chip decimation filter. <sup>3</sup> Off-chip decimation filter. <sup>4</sup> ADC only. <sup>5</sup> Above 200Hz.

Fig. 9. Energy-area efficiency comparison with prior art.

## IV. CONCLUSION

Scaling the area of AC-coupled front-ends has been challenging but essential for high-density, wideband neural signal recording with advanced on-chip processing capabilities. In this paper, we have presented such a front-end based on an area-efficient ULC-based asynchronous SAR ADC and a noise-efficient inverter-based CC-LNA. The prototype fabricated in 65nm CMOS occupies 6560 $\mu m^2$, consumes 0.65 $\mu W$, and achieves 8.1 ENOB with 3.1 $\mu V_{rms}$ IRN within the LFP+AP band.
As a result, the proposed system achieves better noise and energy-area efficiency than previously reported front-ends with 0.97 NEF and 0.78 mm²×fJ/conv-step E-A FoM. The reduction in the unit energy-area cost points towards higher channel counts and more integrated functionality within resource-constrained implants.

## V. ACKNOWLEDGMENT

We thank Prof. Andreas Burg, Sylvain Hauser, Dr. Jonathan Narinx, Dr. Cosimo Aprile, and Dr. Kerim Ture for their help with the test setup, and Carlotta Gastaldi for taking the micrograph.

## REFERENCES

- [1] A. B. Ajiboye *et al.*, "Restoration of reaching and grasping movements through brain-controlled muscle stimulation in a person with tetraplegia: a proof-of-concept demonstration," *Lancet*, vol. 389, no. 10081, pp. 1821–1830, May 2017.
- [2] M. Shoaran *et al.*, "A 16-channel 1.1mm<sup>2</sup> implantable seizure control SoC with sub-μW/channel consumption and closed-loop stimulation in 0.18μm CMOS," in *2016 IEEE Symp. VLSI Circuits*. IEEE, Jun. 2016, pp. 1–2.
- [3] C. Aprile *et al.*, "Adaptive learning-based compressive sampling for low-power wireless implants," *IEEE Trans. Circuits Syst. I Regul. Pap.*, pp. 1–13, 2018.
- [4] S. Y. Park *et al.*, "Modular 128-channel Δ-ΔΣ analog front-end architecture using spectrum equalization scheme for 1024-channel 3-D neural recording microsystems," *IEEE J. Solid-State Circuits*, vol. 53, no. 2, pp. 501–514, Feb. 2018.
- [5] R. R. Harrison *et al.*, "A low-power integrated circuit for a wireless 100-electrode neural recording system," *IEEE J. Solid-State Circuits*, vol. 42, no. 1, pp. 123–133, Jan. 2007.
- [6] T. Jochum, T. Denison, and P. Wolf, "Integrated circuit amplifiers for multi-electrode intracortical recording," *J. Neural Eng.*, vol. 6, no. 1, p. 012001, Feb. 2009.
- [7] R. Muller, S. Gambini, and J. M. Rabaey, "A 0.013 mm<sup>2</sup>, 5 μW, DC-coupled neural signal acquisition IC with 0.5 V supply," *IEEE J. Solid-State Circuits*, vol. 47, no. 1, pp. 232–243, Jan. 2012.
- [8] D. De Dorigo *et al.*, "Fully immersible subcortical neural probes with modular architecture and a delta-sigma ADC integrated under each electrode for parallel readout of 144 recording sites," *IEEE J. Solid-State Circuits*, vol. 53, no. 11, pp. 3111–3125, Nov. 2018.
- [9] H. Kassiri *et al.*, "Rail-to-rail-input dual-radio 64-channel closed-loop neurostimulator," *IEEE J. Solid-State Circuits*, vol. 52, no. 11, pp. 2793–2810, 2017.
- [10] J. P. Uehlin *et al.*, "A 0.0023 mm<sup>2</sup>/ch. delta-encoded, time-division multiplexed mixed-signal ECoG recording architecture with stimulus artifact suppression," *IEEE Trans. Biomed. Circuits Syst.*, vol. 14, no. 2, pp. 319–331, Apr. 2020.
- [11] P. Harpe, "A compact 10-b SAR ADC with unit-length capacitors and a passive FIR filter," *IEEE J. Solid-State Circuits*, vol. 54, no. 3, pp. 636–645, Mar. 2019.
- [12] L. Kull *et al.*, "A 3.1 mW 8b 1.2 GS/s single-channel asynchronous SAR ADC with alternate comparators for enhanced speed in 32 nm digital SOI CMOS," *IEEE J. Solid-State Circuits*, vol. 48, no. 12, pp. 3049–3058, Dec. 2013.
- [13] W. Biederman *et al.*, "A 4.78 mm<sup>2</sup> fully-integrated neuromodulation SoC combining 64 acquisition channels with digital compression and simultaneous dual stimulation," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 1038–1047, Apr. 2015.
- [14] J. Huang *et al.*, "A 0.01-mm<sup>2</sup> mostly digital capacitor-less AFE for distributed autonomous neural sensor nodes," *IEEE Solid-State Circuits Lett.*, vol. 1, no. 7, pp. 162–165, Jul. 2018.
- [15] S.-J. Kim *et al.*, "A sub-µW/ch analog front-end for δ-neural recording with spike-driven data compression," *IEEE Trans. Biomed. Circuits Syst.*, vol. 13, no. 1, pp. 1–14, Feb. 2019.