Anders Nilsson

Who am I? And what is VectorDSP.se?

I'm a passionate DSP processor architect who likes to understand how things work and to challenge "old truths". No challenge is too big!

My journey started as a genuine hobby interest in electronics, continued with a PhD in computer architecture, then a spin-off (Coresonic AB), and finally an acquisition by one of the largest semiconductor companies in the consumer wireless business (MediaTek).

The idea is to use this website to share interesting topics in the field of high-performance digital signal processors, regardless of whether the application is baseband processing, machine learning or computer graphics.


Brief facts about me

Name: Anders Nilsson
Current employer: MediaTek (acquired Coresonic)
Co-founder and CTO of Coresonic AB Dec 2004 -
Previous employers: Coresonic AB
ISY Datorteknik, Computer Engineering group at Linkoping University
Sectra Wireless Technologies AB
PhD defense: 8 June 2007
Publications: See list here.
Patents: 20+. Most of the patents relate to Vector DSP architecture and communication systems such as NR (5G) or LTE.

The PhD defense of Anders Nilsson: Design of programmable multi-standard baseband processors

The public defense of my PhD thesis was held on Friday, June 8, 2007. My opponent was Professor Dr. Gerhard Fettweis, TU Dresden, Dresden, Germany.

The thesis can be downloaded here. The signed cover is available here.


Summary of my PhD Thesis

Efficient programmable baseband processors are important for enabling true multi-standard radio platforms, as the convergence of mobile communication devices and systems requires multi-standard processing devices. The processors not only need to handle variations within a single standard; often they must also cover several completely different modulation methods, such as OFDM and CDMA, with the same processing device. Programmability can also be used to adapt quickly to new and updated standards within the ever-changing wireless communication industry, since a pure ASIC solution will not be flexible enough. ASIC solutions for multi-standard baseband processing are also less area-efficient than their programmable counterparts, since processing resources cannot be efficiently shared between different operations. However, as baseband processing is computationally demanding, traditional DSP architectures cannot be used due to their limited computing capacity. Instead, VLIW- and SIMD-based processors are used to provide sufficient computing capacity for baseband applications. The drawback of VLIW-based DSPs is their low power efficiency, due to the wide instructions that need to be fetched every clock cycle and their control-path overhead. On the other hand, pure SIMD-based DSPs lack the ability to perform different operations concurrently. Since memory access power is the dominating part of the power consumption in a processor, other alternatives should be investigated.

In this dissertation, a new and unique type of processor architecture has been designed which, instead of building on traditional architectures, starts from the application requirements with efficiency in mind. The architecture is named "Single Instruction stream Multiple Tasks", or SIMT for short. The SIMT architecture uses the vector nature of most baseband programs to provide a good trade-off between the flexibility of a VLIW processor and the processing efficiency of a SIMD processor. The contributions of this project are the design and research of key architectural components in the SIMT architecture, as well as the development of design methodologies. Methodologies for accelerator selection are also presented. Furthermore, data dependency control and memory management are studied. Architecture and performance characteristics have also been compared between the SIMT architecture and more traditional processor architectures.

A complete system is demonstrated by the BBP2 baseband processor, which has been designed using SIMT technology. The SIMT principle has previously been proven on a small scale in silicon in the BBP1 processor, which implements a Wireless LAN transceiver. The second demonstrator chip (BBP2) was manufactured in early 2007 and implements a full-scale system with multiple SIMD clusters and a controller core supporting multiple threads. It includes enough memory to run symbol processing of DVB-H/T, WiMAX, IEEE 802.11a/b/g and WCDMA, and the silicon area is 11 mm² in a 0.12 µm CMOS technology.

Layout plot of the fabricated processor (Presented at ISSCC 2008)

Layout plot of the BBP2 processor

For more details about the processor and its architecture, see my PhD thesis here.