#### Project: IEEE P802.15 Working Group for Wireless Personal Area Networks (WPANs)

**Submission Title:** A Preliminary 7nm implementation and communication performance study of SoA FEC classes for Tbps throughputs

**Date Submitted:** 7 May, 2018

**Source:** Onur Sahin, InterDigital Europe

**Address:** 64 Great Eastern St, InterDigital Europe, London, UK, EC2A 3QR

**Voice:** +447459205055, **FAX:** +442077494196, **E-Mail:** onur.sahin@interdigital.com

#### **Re:** n/a

**Abstract:** This talk will provide an overview of state-of-the-art (SoA) high-throughput FEC implementation results in 28nm technology. A performance scaling analysis from 28nm to 7nm will be presented. Based on this analysis, 7nm performance extrapolations for these SoA FEC candidates will be shown, together with the performance gaps between the potential requirements of practical wireless Tbps use-cases and the 7nm performance of the SoA high-throughput FEC candidates. We will also provide a BER performance comparison of selected LDPC codes and Polar codes.

#### Purpose: Information of IEEE 802.15 IG THz

**Notice:** This document has been prepared to assist the IEEE P802.15. It is offered as a basis for discussion and is not binding on the contributing individual(s) or organization(s). The material in this document is subject to change in form and content after further study. The contributor(s) reserve(s) the right to add, amend or withdraw material contained herein.

**Release:** The contributor acknowledges and accepts that this contribution becomes the property of IEEE and may be made publicly available by P802.15.





The EPIC project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 760150.

Enabling Practical Wireless Tb/s Communications with Next Generation Channel Coding

# **SoA FEC for high throughput wireless systems**

- Among existing standards, IEEE 802.11ad, IEEE 802.15.3d, and 3GPP 5G NR specify the FEC classes with the highest throughput values.
- IEEE 802.11ad (Target peak TP: 7 Gbps)
  - Rate (1/2, 5/8, 3/4, 13/16) LDPC with code-word length 672
- IEEE 802.15.3d (Target peak TP: 100 Gbps)
  - Rate 14/15 LDPC (1440,1344)
  - Rate 11/15 LDPC (1440,1056)
- 3GPP 5G NR (Target peak TP: 20 Gbps)
  - Flexible QC-LDPC; 20 Gbps with rate 8/9 is supported.

### 802.11ad FEC 65nm/28nm Implementation Results

| Code | Ref. | Code<br>length | Rate<br>support | Process<br>nm | Area<br>mm² | Freq<br>MHz | TP<br>Gb/s     | Area eff.<br>Gb/s/mm² | Energy eff.<br>pJ/bit | Power dens.<br>W/mm² |
|------|------|----------------|-----------------|---------------|-------------|-------------|----------------|-----------------------|-----------------------|----------------------|
| LDPC | [14] | 672            | 0.8125          | 65            | 0.16        | 500         | 5.6            | 35                    | 17.65                 | 0.62                 |
| LDPC | [22] | 672            | 802.11ad        | 28            | 0.78        | 470         | 18             | 23.6                  | 18                    | 0.41                 |
| LDPC | [23] | 672            | 13/16           | 28                | 2.8         | 220         | 160<br>9 iter. | 57.1<br>9 iter.       | 6<br>9 iter.             | 0.32<br>9 iter.                  |

- [14] M. Li, et al., "An area and energy efficient half row-paralleled layer LDPC decoder for the 802.11ad standard," in Proc. IEEE Workshop on Signal Processing Systems (SiPS'13), Taipei City, Oct. 2013, pp. 112–117.
- [22] M. Li, J. W. Weijers, V. Derudder, I. Vos, M. Rykunov, S. Dupont, P. Debacker, A. Dewilde, Y. Huang, L. V. der Perre, and W. V. Thillo, in Solid-State Circuits Conference (A-SSCC), 2015 IEEE Asian, 2015, pp. 1–5.
- [23] S. Scholl, S. Weithoffer, N. Wehn, "Advanced Iterative Channel Coding Schemes: When Shannon meets Moore", in 9<sup>th</sup> International Symposium on Turbo Codes and Information Processing, pp. 406-411, Invited Talk, 2016, Brest

# Practical Tb/s FEC Implementation KPI Bounds – Broad strokes

- Practical FEC IP area constraint on a SoC: 10 mm<sup>2</sup>
- FEC IP power budget to avoid heat removal issues: 1 W
- FEC decoder throughput: 1 Tbps

| EPIC FEC KPI bounds     | Limit        |
|-------------------------|--------------|
| Area limit              | 10 mm²       |
| Area efficiency limit   | 100 Gb/s/mm² |
| Energy efficiency limit | ~1 pJ/bit    |
| Power density limit     | 0.1 W/mm²    |

 For detailed analysis, see: D1.2 B5G Wireless Tb/s FEC KPI Requirements and Technology Gap Analysis (available at <u>https://epic-h2020.eu/</u>)
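The three derived limits in the table follow arithmetically from the three top-level budgets (10 mm², 1 W, 1 Tbps). The following is a minimal sketch of that arithmetic; the function and key names are illustrative and do not come from the EPIC deliverable.

```python
def kpi_bounds(area_mm2: float, power_w: float, throughput_gbps: float) -> dict:
    """Derive per-area and per-bit limits from the three top-level budgets."""
    return {
        # Throughput that each mm^2 of silicon must deliver.
        "area_eff_gbps_per_mm2": throughput_gbps / area_mm2,
        # W / (Gb/s) = nJ/bit, so multiply by 1e3 to express it in pJ/bit.
        "energy_eff_pj_per_bit": power_w * 1e3 / throughput_gbps,
        # Power that each mm^2 of silicon may dissipate.
        "power_density_w_per_mm2": power_w / area_mm2,
    }


# Budgets quoted on the slide: 10 mm^2, 1 W, 1 Tbps (= 1000 Gb/s).
bounds = kpi_bounds(area_mm2=10.0, power_w=1.0, throughput_gbps=1000.0)
for name, value in bounds.items():
    print(f"{name}: {value:g}")
```

With the slide's budgets this yields 100 Gb/s/mm², 1 pJ/bit and 0.1 W/mm², matching the table.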


### 802.11ad FEC Implementations projected to 7nm

| Code | Ref.                                 | Code<br>length | Rate<br>support | Process<br>nm | Area<br>mm² | Freq<br>MHz  | TP<br>Gb/s | Area eff.<br>Gb/s/mm² | Energy eff.<br>pJ/bit | Power dens.<br>W/mm <sup>2</sup> |
|------|--------------------------------------|----------------|-----------------|---------------|-------------|--------------|------------|-----------------------|-----------------------|----------------------------------|
| LDPC | [14]                                 | 672            | 0.8125          | 7             | 0.003       | 2923<br>1000 | 32<br>11   | 11113<br>3832         | 1.9<br>1.9            | 21.1<br>7.2                      |
| LDPC | [22]                                 | 672            | 802.11ad        | 7             | 0.07        | 1410<br>1000 | 54<br>39   | 830<br>600            | 2.24<br>2.24          | 3.8<br>2.7                       |
| LDPC | [23]<br>9 iterations<br>4 iterations | 672            | 13/16           | 7             | 0.2<br>0.1  | 660.0        | 480<br>480 | 2057<br>4100          | 1.5<br>0.6            | 3.1<br>3.1                       |

- The 7nm projections of [14] and [22] show lower throughput but also a relatively small chip area (methodology in appendix). Throughput can be scaled up with spatially parallel architectures; however, power density remains very challenging.
- In [23], two decoders running in parallel can achieve about 1 Tb/s at 4 iterations with very good energy efficiency, but power density and flexibility are very challenging.
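The reason spatial parallelism helps throughput but not power density can be sketched as follows. The model assumes ideal replication (no interconnect or control overhead), which is a simplification not stated on the slide; the single-decoder power is back-computed from the rounded table values for [23] at 9 iterations (3.1 W/mm² × 0.2 mm²), and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoder:
    throughput_gbps: float
    area_mm2: float
    power_w: float

    @property
    def area_eff(self) -> float:
        return self.throughput_gbps / self.area_mm2

    @property
    def power_density(self) -> float:
        return self.power_w / self.area_mm2


def replicate(d: Decoder, n: int) -> Decoder:
    """Ideal spatial parallelism: n independent copies side by side."""
    return Decoder(d.throughput_gbps * n, d.area_mm2 * n, d.power_w * n)


# [23] at 7nm, 9 iterations, using the rounded values from the table above.
single = Decoder(throughput_gbps=480.0, area_mm2=0.2, power_w=3.1 * 0.2)
two = replicate(single, 2)

print(f"throughput: {single.throughput_gbps:g} -> {two.throughput_gbps:g} Gb/s")
print(f"power density: {single.power_density:.2f} -> {two.power_density:.2f} W/mm^2")
```

Under this model throughput, area and power all grow by the same factor, so area efficiency and power density stay where they were: replication alone cannot close a power density gap.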

### 802.15.3d FEC 7nm Implementation Assessment (1/2)

| Code | Code<br>length             | Rate<br>support     | Process<br>nm     | Core area<br>mm² | No.<br>cores | Total area<br>mm² | Freq<br>MHz | TP<br>Gb/s | Area eff.<br>Gb/s/mm² | Energy eff.<br>pJ/bit | Power dens.<br>W/mm² |
|------|----------------------------|---------------------|-------------------|-------------|--------------|-------------|-------------|------------|----------------------|--------------------------|-------------------------|
| LDPC | 1440<br>(Based<br>on [14]) | 11/15               | 7                 | 0.048       | 11           | 0.528       | 1000        | 990        | 1862                 | 2.33                     | 4.3                     |
| LDPC | 1440<br>(Based<br>on [22]) | 14/15               | 7                 | 0.28        | 11           | 0.3         | 1000        | 990        | 3272                 | 1.98                     | 6.5                     |

- Using 802.11ad FEC implementations in [14] and [22], important parameters are extrapolated to LDPC length-1440 Rate=11/15 and Rate=14/15 codes, and scaled to 7nm (methodology in appendix).
- High throughput (~1Tbps) is achievable
- Excessive power density (>>0.1 W/mm²): an improvement factor of about x50 is needed
- Insufficient energy efficiency (>1 pJ/bit): an improvement factor of about x2 is needed
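The 7nm figures in the table appear consistent with the following reading of the appendix methodology: take the single-core 28nm estimates, run each core at the 1 GHz cap, use 11 cores, and apply the x4 energy efficiency improvement. This is a sketch under that assumption, not the authors' exact procedure; the 28nm inputs come from the appendix table, the total areas from the 7nm table above, and the dictionary keys are illustrative.

```python
CLOCK_CAP_MHZ = 1000.0  # 1 GHz cap stated in the appendix
N_CORES = 11            # number of cores in the 7nm table
S_EE = 1 / 4            # x4 energy efficiency improvement, 28nm -> 7nm

# Single-core 28nm estimates (appendix) and total 7nm area (table above).
cores_28nm = {
    "rate 11/15 (based on [14])": {
        "freq_mhz": 600.0, "tp_gbps": 54.0, "ee_pj": 9.3, "area7_mm2": 0.528},
    "rate 14/15 (based on [22])": {
        "freq_mhz": 500.0, "tp_gbps": 45.0, "ee_pj": 7.9, "area7_mm2": 0.3},
}


def project(core: dict) -> dict:
    # Assumption: per-core throughput scales linearly with clock frequency.
    tp_core = core["tp_gbps"] * CLOCK_CAP_MHZ / core["freq_mhz"]
    tp_total = tp_core * N_CORES
    ee = core["ee_pj"] * S_EE
    # pJ/bit * Gb/s = mW, so multiply by 1e-3 to get W.
    power_w = ee * tp_total * 1e-3
    return {
        "tp_gbps": tp_total,
        "ee_pj": ee,
        "power_density": power_w / core["area7_mm2"],
    }


results = {name: project(core) for name, core in cores_28nm.items()}
for name, r in results.items():
    print(f"{name}: {r['tp_gbps']:.0f} Gb/s, {r['ee_pj']:.3f} pJ/bit, "
          f"{r['power_density']:.2f} W/mm^2")
```

Both configurations come out at 990 Gb/s, and the energy efficiency and power density values land close to the tabulated 2.33/1.98 pJ/bit and 4.3/6.5 W/mm².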

## 802.15.3d FEC 7nm Implementation Assessment (2/2)

- Similarly, using the 802.11ad FEC implementation and architecture in [23], throughput and chip area are estimated by extrapolating to LDPC length-1440 codes and scaling to 7nm (methodology in appendix).
- Architecture [23]: 802.11ad with 672 block size
  - Throughput 480 Gb/s
  - Area 0.2 mm<sup>2</sup> with 9 iterations
- Architecture [23]: 802.15.3d with 1440 block size
  - 1440 block size $\Rightarrow$ throughput 1028 Gb/s
  - Initial estimate of chip area: ~0.4 mm<sup>2</sup> with 5 iterations
  - Energy efficiency and power density estimates require further investigation.
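The 1028 Gb/s figure matches a simple linear scaling of the 480 Gb/s throughput with block length at an unchanged clock. The sketch below assumes exactly that proportionality; the slide does not spell out the rule, and the function name is illustrative.

```python
def scale_throughput_by_block_length(tp_gbps: float, n_old: int, n_new: int) -> float:
    """Scale throughput linearly with code block length at a fixed clock.

    Assumption: the decoder handles one block per fixed time slot, so the
    number of bits delivered per second grows in proportion to block length.
    """
    return tp_gbps * n_new / n_old


tp_1440 = scale_throughput_by_block_length(480.0, n_old=672, n_new=1440)
print(f"{tp_1440:.1f} Gb/s")
```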

## **Performance Gap for a practical Tb/s FEC**

|                      | Constraints                 | LDPC (7nm)<br>CL=672<br>CR=13/16<br>[14] | LDPC (7nm)<br>CL=672<br>CR=13/16<br>[23]                | LDPC (7nm)<br>CL=672<br>CR=802.11ad<br>[22] | LDPC (7nm)<br>CL=1440<br>CR=11/15<br>(based on<br>[14]) | LDPC (7nm)<br>CL=1440<br>CR=14/15<br>(based on<br>[22]) |
|----------------------|-----------------------------|------------------------------------------|---------------------------------------------------------|---------------------------------------------|---------------------------------------------------------|---------------------------------------------------------|
| TP                   | 1 Tbps                      | 32.7                                     | 480 (9 it.)<br>480 (4 it.)<br>~1 Tb/s for 2<br>decoders | 54.0                                        | 990                                                     | 990                                                     |
| Area<br>efficiency   | 100<br>Gb/s/mm <sup>2</sup> | 11113.1                                  | 2057 (9 it.)<br>4100 (4 it.)                            | 830.8                                       | 1862                                                    | 3272                                                    |
| Power<br>density     | 0.1 W/mm <sup>2</sup>       | 21.1                                     | 3.1 (9 it.)<br>3.1 (4 it.)                              | 3.8                                         | 4.3                                                     | 6.5                                                     |
| Energy<br>efficiency | 1 pJ/bit                    | 1.9                                      | 1.5 (9 it.)<br>0.6 (4 it.)                              | 2.24                                        | 2.33                                                    | 1.98                                                    |
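To read the table as gap factors, each entry can be divided by (or into) its constraint. The sketch below uses the tabulated values as-is; the labels, the `gap_factor` helper, and the convention that a factor of 1 or less means the constraint is met are illustrative choices.

```python
# (limit, kind): "min" means the value must be at least the limit,
# "max" means it must not exceed it.
CONSTRAINTS = {
    "tp_gbps": (1000.0, "min"),
    "area_eff": (100.0, "min"),
    "power_density": (0.1, "max"),
    "energy_eff": (1.0, "max"),
}

# Values copied from the table above (7nm projections).
DESIGNS = {
    "CL=672 [14]": {"tp_gbps": 32.7, "area_eff": 11113.1, "power_density": 21.1, "energy_eff": 1.9},
    "CL=672 [23], 9 it.": {"tp_gbps": 480.0, "area_eff": 2057.0, "power_density": 3.1, "energy_eff": 1.5},
    "CL=672 [23], 4 it.": {"tp_gbps": 480.0, "area_eff": 4100.0, "power_density": 3.1, "energy_eff": 0.6},
    "CL=672 [22]": {"tp_gbps": 54.0, "area_eff": 830.8, "power_density": 3.8, "energy_eff": 2.24},
    "CL=1440 (based on [14])": {"tp_gbps": 990.0, "area_eff": 1862.0, "power_density": 4.3, "energy_eff": 2.33},
    "CL=1440 (based on [22])": {"tp_gbps": 990.0, "area_eff": 3272.0, "power_density": 6.5, "energy_eff": 1.98},
}


def gap_factor(value: float, limit: float, kind: str) -> float:
    """Improvement factor still needed; a result <= 1 means the limit is met."""
    return limit / value if kind == "min" else value / limit


gaps = {
    design: {kpi: gap_factor(values[kpi], *CONSTRAINTS[kpi]) for kpi in CONSTRAINTS}
    for design, values in DESIGNS.items()
}

for design, g in gaps.items():
    row = ", ".join(f"{kpi}: x{factor:.2f}" for kpi, factor in g.items())
    print(f"{design}: {row}")
```

Every column meets the area efficiency constraint, while none meets the power density constraint.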


# Performance Gap for a practical Tb/s FEC: Observations

- Current implementation assessments of 802.15.3d LDPC codes based on SoA architectures show significant performance gaps in achieving practical Tbps throughputs, even when 7nm performance scaling is taken into account.
  - Silicon technology evolution to 7nm is expected to provide sufficient gain in the area efficiency of SoA high-throughput FEC.
  - Power density will emerge as a binding constraint, with an initial estimate of a 10x-100x gap between the practical requirements and SoA FEC in 7nm.
  - Energy efficiency will be another constraint with a performance gap.
  - The feasible clock frequency of 1 GHz imposes an additional constraint on Tbps throughput: extremely parallel and unrolled architectures are mandatory.
- <u>Observation:</u> Further implementation study and architecture investigation are necessary to explore the feasibility of existing 802.15.3d codes.

# **Communication Performance Evaluation**

- In addition to implementation performance of the FEC, communication performance, e.g. BER/FER, is also critical.
- We provide an initial simulation study to demonstrate the BER performance of LDPC and Polar Codes for ultra-high throughput data rates.

Simulation Assumptions:

- Modulation: QPSK
- AWGN channel (BH/FH use-case in 802.15.3d study)
- FEC classes evaluated in the study:
  - LDPC: Length-1440, Rate = 11/15, 14/15 802.15.3d codes ([Fricke et al.])
  - Polar codes: Length (L) = 1024, 2048, 32768; Rate = 11/15, 14/15; List size = 1, 2, 4, 8, 16, 32. Density (D) evolution based code design.

[Fricke et al.] A. Fricke, B. Peng, T. Kürner, "Preliminary Performance of FEC Schemes in TG3d Channels," IEEE P802.15 Working Group for WPANs, doc.: IEEE 802.15-16-0746-01-003d, 09.01.2017
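As a reference point for reading coded BER curves under these assumptions, the uncoded QPSK bit error rate over AWGN has the standard closed form BER = ½·erfc(√(Eb/N0)). The sketch below evaluates it and converts between Eb/N0 and Es/N0 for a given code rate. The slides do not state which SNR definition the plots use, so the conversion helper is an assumption added for convenience, and both function names are illustrative.

```python
import math


def qpsk_ber_awgn(ebn0_db: float) -> float:
    """Theoretical uncoded QPSK bit error rate over AWGN (Gray mapping)."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def esn0_db_from_ebn0_db(ebn0_db: float, code_rate: float,
                         bits_per_symbol: int = 2) -> float:
    """Convert Eb/N0 (per information bit) to Es/N0 (per channel symbol)."""
    return ebn0_db + 10 * math.log10(code_rate * bits_per_symbol)


for ebn0_db in (0.0, 4.0, 8.0, 9.6):
    print(f"Eb/N0 = {ebn0_db:4.1f} dB -> uncoded BER = {qpsk_ber_awgn(ebn0_db):.2e}")

for rate in (11 / 15, 14 / 15):
    offset = esn0_db_from_ebn0_db(0.0, rate)
    print(f"rate {rate:.3f}: Es/N0 = Eb/N0 + {offset:.2f} dB")
```

The horizontal distance between this uncoded curve and a coded curve at a given BER is the coding gain referred to in the comparisons of LDPC and Polar codes.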

#### Rate 11/15 LDPC L=1440 vs Polar codes L=1024

- Polar codes with list size L=1 perform within 1 dB of the LDPC coding gain for BER>10<sup>-5</sup>. For BER<10<sup>-5</sup>, the slope of the curves indicates a diminishing performance gap.
- Polar codes with list size L=8 perform similarly to LDPC for BER>10<sup>-5</sup> and show better coding gain for BER<10<sup>-5</sup>.
- Further investigation is necessary to quantify the complexity and latency of the various Polar code and LDPC options for Tbps throughputs (e.g. list sizes and block lengths) and to compare their communication performance.



### Rate 14/15 LDPC L=1440 vs Polar codes L=1024

- Polar codes with list size L=1 perform within 1 dB of the LDPC coding gain over the evaluated BER range. For BER<10<sup>-5</sup>, the slope of the curves indicates a diminishing performance gap.
- Polar codes with list size L=16 perform similarly to LDPC for BER>10<sup>-5</sup>.
- <u>Observation</u>: Further study is necessary to compare the performance of different FEC classes, particularly in the lower BER regime (e.g. BER<10<sup>-5</sup>), taking into account the known error-floor characteristic of LDPC codes.



## Conclusion

- Initial implementation studies demonstrate that SoA FEC codes and architectures, including 802.15.3d codes, fall short of achieving a practical Tbps throughput and other important KPI targets.
- Further implementation study and investigation of various architectures are necessary to explore the feasibility of existing 802.15.3d codes and other FEC classes.
- An initial communications performance study of Polar codes with different options, e.g. list sizes and block lengths, demonstrates competitive performance with respect to 802.15.3d codes.
- Further study is necessary to compare the communications performance of various FEC classes in the Tbps domain, particularly in the lower BER regime.



# Appendix


# **Projection Methodology: Scaling Formulas**

| Parameter name    | Parameter in SoA<br>technology node | Scaling factor           | Scaled parameter                  |
|-------------------|-------------------------------------|--------------------------|-----------------------------------|
| Throughput        | $T$                                 | $S_F$                    | $T \cdot S_F$                     |
| Clock frequency   | $F$                                 | $S_F$                    | $F \cdot S_F$                     |
| Area              | $A$                                 | $S_A$                    | $A \cdot S_A$                     |
| Power             | $P = EE \cdot T$                    | $S_{EE} \cdot S_F$       | $P \cdot S_{EE} \cdot S_F$        |
| Area efficiency   | $AE = T/A$                          | $S_F/S_A$                | $AE \cdot S_F/S_A$                |
| Energy efficiency | $EE$                                | $S_{EE}$                 | $EE \cdot S_{EE}$                 |
| Power density     | $PD = P/A$                          | $S_{EE} \cdot S_F / S_A$ | $PD \cdot S_{EE} \cdot S_F / S_A$ |

# 28nm to 7nm Implementation KPI Projection

Based on ITRS roadmap [ITRS2015] and NVIDIA analysis [Villa2014], moving from 28nm to 7nm will bring *approximately:* 

- x12 factor reduction in area
- x4 factor improvement in energy efficiency
- x3 increase in clock speed (theoretical maximum operating frequency); however, we limit the maximum frequency to 1 GHz, which is a feasible frequency for an SoC IP.

[ITRS2015] ITRS 2.0, International Technology Roadmap for Semiconductors, 2015 Edition, Section 5: More Moore.

[Villa2014] O. Villa et al., "Scaling the Power Wall: A path to Exascale," International Conference for High Performance Computing, Networking, Storage and Analysis, Nov. 2014.
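The scaling formulas and the approximate 28nm-to-7nm factors can be combined into a short projection routine. The sketch below assumes S_F = 3, S_A = 1/12 and S_EE = 1/4, reads the area row as A·S_A, and applies the formulas to the 28nm results of [23] (160 Gb/s, 220 MHz, 2.8 mm², 6 pJ/bit). The projected clock of 660 MHz is below the 1 GHz limit, so no frequency cap is modelled. The function and key names are illustrative.

```python
S_F = 3.0      # clock speed / throughput scaling, 28nm -> 7nm
S_A = 1 / 12   # area scaling (x12 reduction)
S_EE = 1 / 4   # energy-per-bit scaling (x4 improvement)


def project_to_7nm(tp_gbps: float, freq_mhz: float, area_mm2: float,
                   ee_pj_per_bit: float) -> dict:
    """Apply the scaling formulas from the projection methodology table."""
    # P = EE * T; pJ/bit * Gb/s = mW, so multiply by 1e-3 to get W.
    power_w = ee_pj_per_bit * tp_gbps * 1e-3
    area_eff = tp_gbps / area_mm2
    power_density = power_w / area_mm2
    return {
        "tp_gbps": tp_gbps * S_F,
        "freq_mhz": freq_mhz * S_F,
        "area_mm2": area_mm2 * S_A,
        "power_w": power_w * S_EE * S_F,
        "area_eff": area_eff * S_F / S_A,
        "ee_pj_per_bit": ee_pj_per_bit * S_EE,
        "power_density": power_density * S_EE * S_F / S_A,
    }


# 28nm results of [23] (9 iterations) from the 802.11ad implementation table.
p = project_to_7nm(tp_gbps=160.0, freq_mhz=220.0, area_mm2=2.8, ee_pj_per_bit=6.0)
for name, value in p.items():
    print(f"{name}: {value:.3f}")
```

After rounding, the output agrees with the 7nm row for [23] at 9 iterations: 480 Gb/s, 660 MHz, 0.2 mm², 2057 Gb/s/mm², 1.5 pJ/bit and 3.1 W/mm².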

## 802.15.3d 28nm Implementation Assessments

| Code                       | Code<br>length | Rate<br>support | Process<br>nm | Area<br>mm² | Freq<br>MHz | TP<br>Gb/s<br>(input) | Area eff.<br>Gb/s/mm² | Energy eff.<br>pJ/bit | Power dens.\*<br>W/mm²    |
|----------------------------|----------------|-----------------|---------------|-------------|-------------|-----------------------|-----------------------|-----------------------|---------------------------|
| LDPC<br>(Based<br>on [14]) | 1440           | 11/15           | 28            | 0.55        | 600         | 54                    | 93                    | 9.3                   | 0.87                      |
| LDPC<br>(Based<br>on [22]) | 1440           | 14/15           | 28            | 0.31        | 500         | 45                    | 136                   | 7.9                   | 1.1                       |

- Throughput at 4 iterations of layered decoding

\*Peak power

#### Rate 11/15 LDPC 1440 vs Polar codes 2048




#### Rate 14/15 LDPC 1440 vs Polar codes 2048




# Rate 14/15 LDPC 1440 vs Polar codes {1024,2048,32768}



#### EPIC

#### **EPIC Grant Agreement No. 760150**

"The EPIC project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 760150."

If you need further information, please contact the coordinator: TECHNIKON Forschungs- und Planungsgesellschaft mbH Burgplatz 3a, 9500 Villach, AUSTRIA Tel: +43 4242 233 55 Fax: +43 4242 233 55 77 E-Mail: coordination@epic-h2020.eu

The information in this document is provided "as is", and no guarantee or warranty is given that the information is fit for any particular purpose. The content of this document reflects only the author's view – the European Commission is not responsible for any use that may be made of the information it contains. The users use the information at their sole risk and liability.
