# Designing Future VLSI Systems with Monolithically Integrated Silicon-Photonics

### Vladimir Stojanović, University of California, Berkeley



SSCS DL Lecture, University of Texas, Austin, November 2013

## Acknowledgments

- Milos Popović (Boulder), Rajeev Ram, Michael Watts, Hanqing Li (MIT), Krste Asanović (UC Berkeley)
- Jason Orcutt, Jeffrey Shainline, Christopher Batten, Ajay Joshi, Anatoly Khilo
- Mark Wade, Karan Mehta, Erman Timurdogan, Jie Sun, Cheryl Sorace, Josh Wang
- Michael Georgas, Jonathan Leu, Benjamin Moss, Chen Sun, Yu-Hsin Chen
- Yong-Jin Kwon, Scott Beamer, Yunsup Lee, Andrew Waterman, Miquel Planas
- Roy Meade, Gurtej Sandhu and Fab12 team (Zvi, Ofer, Daniel, Efi, Elad, ...)
- DARPA, Micron, NSF and FCRP IFC
- IBM Trusted Foundry, CNSE Albany, Solid-State Circuits Society

## Chip design is going through a change

- Already have more devices than we can use at once
- Limited by power density and bandwidth









- Oracle T5: 16 cores, 128 threads
- Nvidia Fermi: 540 CUDA cores
- IBM Power 7: 8 cores, 32 threads
- Intel Knights Corner: 50 cores, 200 threads





"The Processor is the new Transistor" [Rowen]

#### Bandwidth, pin count and power scaling



#### Memory interface scaling problems: Energy-cost and bandwidth density



Energy cost [pJ/bit]

#### Power and pins required for 10TFlop/s



# Monolithic Si-Photonics for core-to-core and core-to-DRAM networks



- Bandwidth density – need dense WDM
- Energy-efficiency – need monolithic integration

#### **Monolithic CMOS photonic integration**



Thin BOX SOI CMOS Electronics

**Bulk CMOS Electronics** 

#### Si and polySi waveguide formation





#### Integrated photonic interconnects



#### Single channel link tradeoffs



#### **Resonance sensitivity**



- Process and temperature shift resonances
- Direct thermal tuning is cost-prohibitive

Georgas CICC 2011, Sun NOCS 2012

#### Smarter wavelength tuning



Tuning Power [mW]

## Need to optimize carefully



- Laser energy increases with data-rate
  - Limited Rx sensitivity
  - Modulation more expensive -> lower extinction ratio

- Tuning costs decrease with data-rate
- Moderate data rates most energy-efficient

Georgas CICC 2011, assuming 32nm CMOS

[Plot legend: Laser, Buffer, lock]
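The tradeoff on this slide can be illustrated with a toy model: a fixed tuning power is amortized over more bits as the data rate rises, while the laser energy per bit grows with data rate. The sketch below is not the model from Georgas CICC 2011. The function name and all three coefficients (`tuning_mw`, `fixed_fj`, `laser_fj_per_gbps`) are hypothetical placeholders chosen only to show why an intermediate data rate can minimize energy per bit.

```python
def energy_per_bit_fj(rate_gbps, tuning_mw=0.1, fixed_fj=20.0, laser_fj_per_gbps=4.0):
    """Toy per-bit energy (fJ/bit) of one wavelength channel.

    All coefficients are hypothetical placeholders, not measured values.
    """
    # Constant tuning power amortized over the bit rate:
    # mW / (Gb/s) = pJ/bit, times 1e3 gives fJ/bit.
    tuning_fj = tuning_mw * 1e3 / rate_gbps
    # Assumed laser cost that grows linearly with data rate.
    laser_fj = laser_fj_per_gbps * rate_gbps
    return tuning_fj + fixed_fj + laser_fj


# Sweep integer data rates from 1 to 20 Gb/s and pick the cheapest one.
rates_gbps = list(range(1, 21))
best_rate_gbps = min(rates_gbps, key=energy_per_bit_fj)

for rate in (1, best_rate_gbps, 20):
    print(f"{rate:2d} Gb/s: {energy_per_bit_fj(rate):6.1f} fJ/bit")
```

With these placeholder coefficients the minimum falls at a moderate rate rather than at either end of the sweep, which is the qualitative point of the slide. Real coefficients depend on the process, the receiver sensitivity, and the tuning scheme.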

## **DWDM link efficiency optimization**



#### **Optimize for min energy-cost**

# Bandwidth density dominated by circuit and photonics area (not coupler pitch)

- 10x better than electrical bump limited
- 200x better than electrical package pin limit

#### Many architectural studies show promise



[Koka'08-10]





[Vantrease'08] [Psota'07] [Kirman'06]



[Batten'08]



[Beamer'10]

# Photonic memory interface – leveraging optical bandwidth density



#### Important Concepts

- Power/message switching (only to active DRAM chip in DRAM cube/super DIMM)

- Vertical die-to-die coupling (minimizes cabling - 8 dies per DRAM cube)

- Command distributed electrically (broadcast)
- Data photonic (single writer multiple readers)

Enables energy-efficient throughput and capacity scaling per memory channel

#### Beamer ISCA 2010

#### **Laser Power Guiding Effectiveness**



Enables capacity scaling per channel and significant savings in laser energy

Beamer ISCA 2010

## **Optimizing DRAM with photonics**



### **Design Space Exploration of Networks Tool**

DSENT – A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks Modeling



#### Significant integration activity, but hybrid and older processes ...



#### [Luxtera/Oracle/Kotura]



[Intel]



[IBM]



[Many schools]

[Watts/Sandia/MIT] [Lipson/Cornell]

[Kimerling/MIT]



[HP]





#### **Our work: Si Electronic-Photonic Integration Timeline**



#### **EOS Platform: EOS8 fabricated in IBM12SOI**



Orcutt et al, Optics Express, 2012

- 3 x 3 mm die
- 45nm Thin BOX SOI Technology (used for Power 7 and Cell processors)
- **3M Transistors**
- 400 Pads
- ARM Standard Cells and custom link circuits

#### **EOS8 performance summary**



- Fiber-to-chip grating couplers with 3.5 dB insertion loss
- Waveguides under 4 dB/cm propagation loss
- 10 dB extinction optical modulators
- 8 channel wavelength division multiplexing filter bank with <-20 dB cross talk
- All integrated with electronic circuits
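The coupler and waveguide figures above combine into a simple optical loss budget. The sketch below adds the dB losses for a path with two grating couplers and a length of waveguide, then converts the total to a power fraction. The 1 cm path length and the two-coupler topology are assumptions for illustration, the 4 dB/cm value is used as an upper bound, and modulator and filter losses are left out.

```python
COUPLER_LOSS_DB = 3.5           # per grating coupler, from the slide
WAVEGUIDE_LOSS_DB_PER_CM = 4.0  # upper bound ("under 4 dB/cm"), from the slide


def link_loss_db(waveguide_cm, n_couplers=2):
    """Total passive loss in dB: couplers plus waveguide propagation."""
    return n_couplers * COUPLER_LOSS_DB + WAVEGUIDE_LOSS_DB_PER_CM * waveguide_cm


def db_to_fraction(loss_db):
    """Convert a loss in dB to the fraction of optical power that survives."""
    return 10 ** (-loss_db / 10)


# Assumed example path: fiber -> chip -> 1 cm of waveguide -> fiber.
total_db = link_loss_db(1.0)
print(f"loss: {total_db:.1f} dB, power remaining: {db_to_fraction(total_db):.3f}")

# The same conversion applied to the filter-bank crosstalk bound of -20 dB.
print(f"crosstalk power fraction: {db_to_fraction(20.0):.3f}")
```

Because every dB of passive loss has to be made up by laser power, this kind of budget feeds directly into the laser energy term of the link.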

### Integration of photonics into VLSI tools



### **Circuit/Device Co-Simulation: VerilogA**



#### **Platform Organization**



#### **Chips fully packaged**







# Best waveguide losses ever reported in a sub-100nm production CMOS line

- Body-Si waveguides
  - 3-4dB/cm loss
- Poly waveguides
  - 50dB/cm loss

- Body-Si ring Q factor
  - 227k @ 1280nm
  - 112k @ 1550nm
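For intuition, a ring's Q factor can be converted to a resonance linewidth with the standard definition Q = f0 / Δf, where f0 is the optical carrier frequency and Δf is the full width at half maximum. The sketch below applies it to the two Q values listed above. The function name is illustrative, and treating the quoted Q as a FWHM-based value is an assumption.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def linewidth_ghz(q_factor, wavelength_nm):
    """Resonance linewidth (GHz) from Q = f0 / delta_f."""
    carrier_hz = SPEED_OF_LIGHT_M_PER_S / (wavelength_nm * 1e-9)
    return carrier_hz / q_factor / 1e9


for q_factor, wavelength_nm in ((227e3, 1280.0), (112e3, 1550.0)):
    print(f"Q = {q_factor:.0f} @ {wavelength_nm:.0f} nm -> "
          f"{linewidth_ghz(q_factor, wavelength_nm):.2f} GHz linewidth")
```

Linewidths on the order of 1 to 2 GHz are far narrower than a multi-Gb/s signal needs, so modulators and filters are designed with much lower Q than these test rings.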



#### **Exceptional dimensional control in 45nm node**



- 8-wavelength filterbank results
  - Filter channels fabricated in order
  - Less than 1nm variation
- Excellent channel isolation (>20dB at 250GHz spacing)

### Integrated thermal tuning circuits



- 10mW required to retune all 8 rings
- Negligible overhead of tuning circuits (thermal BW < 500kHz)
- Tuning efficiency 130uW/K (32.4mW/$2\pi$) - fully substrate released chips

[Plot: relative transmission (dB) vs. wavelength (nm), 1265 to 1275 nm]
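The tuning numbers on this slide can be related with a little arithmetic. The sketch below uses the quoted 130uW/K and 32.4mW/2π to estimate the temperature swing for a full free spectral range, and the average heater power per ring if the 10mW retuning budget is split evenly across the 8 rings. The even split is an assumption, and the names are illustrative.

```python
import math

TUNING_EFFICIENCY_W_PER_K = 130e-6  # 130 uW/K, from the slide
FULL_FSR_POWER_W = 32.4e-3          # 32.4 mW per 2*pi of phase, from the slide
RETUNE_BUDGET_W = 10e-3             # 10 mW for all 8 rings, from the slide
NUM_RINGS = 8


def heater_power_w(delta_t_kelvin):
    """Heater power needed to hold a ring delta_t_kelvin above ambient."""
    return TUNING_EFFICIENCY_W_PER_K * delta_t_kelvin


def power_for_phase_w(phase_rad):
    """Heater power for a given round-trip phase shift (2*pi = one FSR)."""
    return FULL_FSR_POWER_W * phase_rad / (2 * math.pi)


# Temperature swing implied by tuning across one full FSR.
full_fsr_temperature_k = FULL_FSR_POWER_W / TUNING_EFFICIENCY_W_PER_K

# Average per-ring power, assuming the budget is split evenly.
per_ring_power_w = RETUNE_BUDGET_W / NUM_RINGS
per_ring_fsr_fraction = per_ring_power_w / FULL_FSR_POWER_W

print(f"full-FSR swing: {full_fsr_temperature_k:.0f} K")
print(f"per ring: {per_ring_power_w * 1e3:.2f} mW "
      f"({per_ring_fsr_fraction:.1%} of an FSR)")
```

Under these assumptions each ring needs only a few percent of an FSR on average.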

#### Low-power current-sensing optical receiver



Georgas ESSCIRC 2011, JSSC 2012

## **Optical modulator design**

Shainline, Popović

[Figure: 10µm scale bars; plot over wavelength [nm], 1534.0 to 1535.6]

- Extinction ratio 19dB
- 45GHz 3dB optical bw

#### at 1280nm

- Extinction ratio 9dB
- 60GHz 3dB optical bw

#### **Optical modulator – electrical tests**



- Carrier-lifetime 2-3ns -> 200MHz electrical bandwidth
  - Diffusion time constant affected by
    - Recombination time
    - Drift conditions

#### Modulator driver sub-bit pre-emphasis



- Partial forward bias at 0-bit key to fast operation

#### **Modulator driver heads**



- Split-supply used for sub-bit pre-emphasis
  - Use core and I/O voltages, no regulators

#### First modulation in 45nm process



- 2.5Gb/s modulation
- 1.2pJ/bit
- 3dB insertion loss
- 3dB extinction ratio

#### Moss ISSCC 2013



#### **Depletion modulators in 45nm SOI CMOS**



Shainline et al., Optics Letters 2013

## **Depletion modulators in 45nm SOI CMOS**



- Modulation:
  - 5 Gbps
  - 5.2dB extinction ratio
- Energy:
  - 55 fJ/bit
  - Tunable across FSR with 400GHz/mW (~2nm/mW)
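Two quick checks on these numbers. First, a frequency tuning rate converts to a wavelength tuning rate through Δλ = λ²Δf/c. Second, energy per bit times data rate gives the average transmitter power. The sketch below assumes a 1280nm operating wavelength, which is not stated on this slide, and uses illustrative function names.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def ghz_to_nm(delta_f_ghz, wavelength_nm):
    """Convert a frequency shift to a wavelength shift near wavelength_nm."""
    wavelength_m = wavelength_nm * 1e-9
    delta_lambda_m = wavelength_m ** 2 * delta_f_ghz * 1e9 / SPEED_OF_LIGHT_M_PER_S
    return delta_lambda_m * 1e9


def average_power_mw(energy_j_per_bit, rate_bps):
    """Average power in mW for a given energy per bit and data rate."""
    return energy_j_per_bit * rate_bps * 1e3


# 400 GHz/mW at an assumed wavelength of 1280 nm.
tuning_nm_per_mw = ghz_to_nm(400.0, 1280.0)
print(f"tuning: {tuning_nm_per_mw:.2f} nm/mW")

# Depletion modulator: 55 fJ/bit at 5 Gb/s.
depletion_power_mw = average_power_mw(55e-15, 5e9)
# Earlier 45nm transmitter result: 1.2 pJ/bit at 2.5 Gb/s.
first_modulation_power_mw = average_power_mw(1.2e-12, 2.5e9)
print(f"depletion modulator: {depletion_power_mw:.3f} mW")
print(f"first 45nm modulation: {first_modulation_power_mw:.1f} mW")
```

At the assumed wavelength the conversion gives roughly 2.2 nm/mW, consistent with the "~2nm/mW" quoted above.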



#### Memory interface scaling problems: **Energy-cost and bandwidth density**



Energy cost [pJ/bit]

#### Power and pins required for 10TFlop/s



## DRAM side: Bulk integration (polySi photonics)



## Summary

- Silicon-photonics can push both critical dimensions
  - Energy-efficiency: monolithic integration
  - Bandwidth density: dense WDM
- Need to optimize across layers
  - Connect devices to circuits, and links to networks
- Building early technology development platforms
  - Feedback to device and circuit designers
  - Accelerated adoption
- EOS Platform designed for multi-project wafer runs
  - Best end-of-line passives in sub-100nm process (3-4dB/cm loss)
  - sub-100fJ/b transmitters/receivers
  - Record-high tuning efficiency with undercut ~ 25uW/K

## Conclusions

- Silicon-photonics enabler of new capabilities
  - Think "new on-chip inductor" or "new on-chip t-line"
- Potentially revolutionize many applications despite slowdown in CMOS scaling
  - VLSI compute and network infrastructure
  - Wireless comm
  - Imaging and Sensing
- Need process, device, circuit and system-level understanding
- So, jump in and ride the "new wave"

