Stefano Buzzi and Carmen D'Andrea
Enhanced mobile broadband (eMBB) is one of the key use cases for the development of the new standard 5G New Radio for the next generation of mobile wireless networks. Large-scale antenna arrays, a.k.a. massive multiple-input multiple-output (MIMO), the usage of carrier frequencies in the range 10-100 GHz, the so-called millimeter wave (mm-Wave) band, and the network densification with the introduction of small-sized cells are the three technologies that will permit implementing eMBB services and realizing the Gbit/s mobile wireless experience. This paper is focused on the massive MIMO technology. Initially conceived for conventional cellular frequencies in the sub-6 GHz range (μ-Wave), the massive MIMO concept has then been progressively extended to the case in which mm-Wave frequencies are used. However, due to different propagation mechanisms in urban scenarios, the resulting MIMO channel models at μ-Wave and mm-Wave are radically different. Six key basic differences are pinpointed in this paper, along with the implications that they have on the architecture and algorithms of the communication transceivers and on the attainable performance in terms of reliability and multiplexing capabilities.
millimeter wave; microwave; channel modeling; massive MIMO; doubly massive MIMO
1 Introduction
Fifth-generation (5G) wireless networks are expected to provide a 1000x improvement in the supported data rate, as compared to current LTE networks. Such an improvement will be mainly achieved through the concurrent use of three factors [1]: (a) the reduction in the size of the radio cells, so that a larger data-rate density can be achieved; (b) the use of large-scale antenna arrays at the base stations (BSs), i.e., massive multiple-input multiple-output (MIMO) [2], so that several users can be multiplexed in the same time-frequency resource slot through multiuser MIMO (MU-MIMO) techniques; and (c) the use of carrier frequencies in the range 10 GHz-100 GHz, a.k.a. millimeter waves (mm-Waves) [3], so that larger bandwidths become available. Factor (a), i.e., the densification of the network, is actually a trend that we have been observing for some decades, in the sense that the size of the radio cells has been progressively reduced from one generation of cellular networks to the next. Factor (b), instead, can be seen as a sort of 4.5G technology, in the sense that the latest 3GPP LTE releases already include the possibility of equipping BSs with antenna arrays of up to 64 elements. This trend will certainly continue in the future 5G New Radio standard, since the potential of massive MIMO is currently being tested worldwide in a number of real-world experiments (for instance, [4] and [5]). The use of mm-Waves, on the contrary, is a more recent technology, at least as far as wireless cellular applications are concerned, and, although there is no doubt that future cellular networks will rely on it, mm-Waves can certainly be classified as a true 5G technology.
Focusing on the massive MIMO technology, most of the research and experimental work has considered its use at conventional cellular frequencies (e.g., sub-6 GHz). We denote here such a range of frequencies as μ-Wave, to contrast them with the above-6 GHz frequencies that we denote as mm-Wave. Only recently has the combination of the massive MIMO concept with the use of mm-Wave frequency bands started being considered [6], [7]. As a matter of fact, the channel propagation mechanisms at μ-Wave frequencies are completely different from those at mm-Waves. For instance, a rich-scattering environment is observed at μ-Wave frequencies in urban scenarios [8], thus implying that the MIMO channel is customarily modeled as the product of a scalar constant, which accounts for the shadowing effects and the path loss, times a matrix with independent and identically distributed (i.i.d.) entries. At mm-Waves, instead, propagation is mainly based on Line-of-Sight (LOS) propagation and on one-hop reflections, and blockage phenomena are more frequent. To capture these mechanisms, a finite-rank clustered channel model is usually employed [9]-[11]. This paper compares massive MIMO systems at μ-Waves with massive MIMO systems at mm-Waves. We observe that these two different channel models have key implications on the achievable performance, on the multiplexing capabilities of the channels themselves, on the beamforming strategies that can be employed, on the transceiver algorithms and on the adopted channel estimation procedures. Six key differences between massive MIMO systems at μ-Waves and massive MIMO systems at mm-Waves are thus identified and critically discussed.
The rest of this paper is organized as follows. Section 2 describes the considered transceiver model and the massive MIMO channel models at μ-Wave and at mm-Wave frequencies. Section 3, the core of the paper, is divided into six subsections, each one describing a key difference between the massive MIMO channels at μ-Wave and at mm-Wave frequencies; numerical results are also shown there in order to provide experimental evidence of the theoretical discussion. Finally, concluding remarks are given in Section 4.
2 System and Channel Models
In this section, we briefly illustrate the considered transceiver architecture and review the main characteristics of the MIMO wireless channel at μ-Wave and mm-Wave carrier frequencies.
We consider a MIMO wireless link with $N_T$ antennas at the transmitter and $N_R$ antennas at the receiver. We denote by $d$ the distance between the transmitter and receiver, and by $M$ the number of transmitted parallel data streams (i.e., the multiplexing order). The considered transceiver model is shown in Fig. 1.
2.1 μ-Wave Channel Model
Assuming frequency-flat fading (i.e., either multipath may be neglected or it is nulled through the use of OFDM modulation), at carrier frequencies below 6 GHz the propagation channel is customarily modelled through an $(N_R \times N_T)$-dimensional matrix $\mathbf{H}$, whose $(i,j)$-th entry, $\mathbf{H}(i,j)$, has the following structure [12], [13]:
$$\mathbf{H}(i,j) = \sqrt{\beta}\, g_{i,j}, \tag{1}$$
where $g_{i,j}$ represents the small-scale (fast) fading between the $i$-th receive antenna and the $j$-th transmit antenna, and $\beta$ represents the (slow) large-scale fading (shadowing) and the path loss between the transmitter and the receiver. In a rich scattering environment, the coefficients $g_{i,j}$ are i.i.d. $\mathcal{CN}(0,1)$ random variables. The factor $\beta$ is assumed constant across the transmit and receive antennas (i.e., it does not depend on the indices $i, j$), and is usually expressed as:
$$\beta = 10^{PL/10}\, 10^{\sigma_{sh} z/10}, \tag{2}$$
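where $PL$ represents the path loss and $10^{\sigma_{sh} z/10}$ represents the shadow fading with standard deviation $\sigma_{sh}$ and $z \sim \mathcal{N}(0,1)$. With regard to the path loss $PL$, several models have been derived over the years, based on theoretical models and/or on empirical heuristics. According to the popular three-slope model [13], [14], the path loss in logarithmic units is given by:
$$PL = \begin{cases} -L - 35 \log_{10}(d), & d > d_1, \\ -L - 15 \log_{10}(d_1) - 20 \log_{10}(d), & d_0 < d \le d_1, \\ -L - 15 \log_{10}(d_1) - 20 \log_{10}(d_0), & d \le d_0, \end{cases} \tag{3}$$
where
$$L = 46.3 + 33.9 \log_{10}(f) - 13.82 \log_{10}(h_T) - \left(1.1 \log_{10}(f) - 0.7\right) h_R + \left(1.56 \log_{10}(f) - 0.8\right), \tag{4}$$
with $f$ the carrier frequency in MHz, $h_T$ the transmitter antenna height in meters, and $h_R$ the receiver antenna height in meters. Given that the small-scale fading contributions to the entries of the matrix $\mathbf{H}$ are i.i.d. random variates, the channel matrix has full rank with probability 1, and its rank is equal to the minimum between $N_T$ and $N_R$.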
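As a minimal sketch of how the large-scale coefficient described above could be evaluated, the Python fragment below computes a three-slope path loss and draws one shadowed gain. The slopes (35 and 20 dB per decade), the Hata-COST231 form of the constant, the breakpoint names `d0` and `d1`, and every numerical value in the usage lines are assumptions taken from the commonly used form of this model; they are not values stated in this paper.

```python
import math
import random


def hata_cost231_constant(f_mhz, h_t, h_r):
    """Constant L in dB (assumed Hata-COST231 form); f in MHz, heights in m."""
    lf = math.log10(f_mhz)
    return (46.3 + 33.9 * lf - 13.82 * math.log10(h_t)
            - (1.1 * lf - 0.7) * h_r + (1.56 * lf - 0.8))


def three_slope_path_loss_db(d, d0, d1, big_l):
    """Path loss in dB (a negative gain); d, d0 and d1 share one distance unit."""
    if d > d1:
        return -big_l - 35.0 * math.log10(d)
    if d > d0:
        return -big_l - 15.0 * math.log10(d1) - 20.0 * math.log10(d)
    # Below the first breakpoint the loss is held constant.
    return -big_l - 15.0 * math.log10(d1) - 20.0 * math.log10(d0)


def large_scale_gain(pl_db, sigma_sh_db, rng):
    """beta = 10^(PL/10) * 10^(sigma_sh * z / 10), with z ~ N(0, 1)."""
    z = rng.gauss(0.0, 1.0)
    return 10.0 ** (pl_db / 10.0) * 10.0 ** (sigma_sh_db * z / 10.0)


# Hypothetical numbers, for illustration only.
L_DB = hata_cost231_constant(f_mhz=1900.0, h_t=15.0, h_r=1.65)
PL_DB = three_slope_path_loss_db(d=100.0, d0=10.0, d1=50.0, big_l=L_DB)
BETA = large_scale_gain(PL_DB, sigma_sh_db=8.0, rng=random.Random(0))
print(round(PL_DB, 2), BETA)
```

The three branches are written so that the curve is continuous at both breakpoints, which is the property that makes the piecewise form usable in simulations.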
2.2 mm-Wave Channel Model
At mm-Waves, propagation mechanisms are different from those at μ-Waves. Indeed, path loss is much larger, while diffraction effects are practically negligible, thus implying that the typical range in cellular environments is usually not larger than 100 m, and the non-LOS component is mainly based on reflections. Moreover, signal blockages, due to the presence of macroscopic obstacles between the transmitter and the receiver, are much more frequent than those at μ-Wave frequencies. In order to capture these peculiarities, general consensus has been reached on the so-called clustered channel model [7], [15]-[18]. This model is based on the assumption that the propagation environment is made of $N_{cl}$ scattering clusters, the $i$-th of which contributes with $N_{ray,i}$ propagation paths, plus a possibly present LOS component. Apart from the LOS component, the transmitter and the receiver are linked through single reflections on the scattering clusters. Assuming again frequency-flat fading and focusing on a bi-dimensional model for the sake of simplicity, the baseband equivalent of the propagation channel is now represented by an $(N_R \times N_T)$-dimensional matrix expressed as:
$$\mathbf{H} = \gamma \sum_{i=1}^{N_{cl}} \sum_{l=1}^{N_{ray,i}} \alpha_{i,l} \sqrt{L(r_{i,l})}\, \mathbf{a}_r(\phi_{i,l}^r)\, \mathbf{a}_t^H(\phi_{i,l}^t) + \mathbf{H}_{LOS}. \tag{5}$$
In the above equation, we denote by $\phi_{i,l}^r$ and $\phi_{i,l}^t$ the angles of arrival and departure of the $l$-th ray in the $i$-th scattering cluster, respectively. The quantities $\alpha_{i,l}$ and $L(r_{i,l})$ are the complex path gain and the attenuation associated to the $(i,l)$-th propagation path. Following [10], the attenuation $L(r_{i,l})$ of the $(i,l)$-th path is written in logarithmic units as:
$$L(r_{i,l}) = -20 \log_{10}\left(\frac{4\pi}{\lambda}\right) - 10\, n \left[1 - b + \frac{b\, c}{\lambda f_0}\right] \log_{10}(r_{i,l}) - X_\sigma, \tag{6}$$
with $r_{i,l}$ the length of the path, $\lambda$ the wavelength, $c$ the speed of light, $n$ the path loss exponent, $X_\sigma$ the zero-mean, $\sigma^2$-variance Gaussian-distributed shadow fading term in logarithmic units, $b$ a system parameter, and $f_0$ a fixed reference frequency, the centroid of all the frequencies represented by the path loss model. The values of all these parameters for the four different use-case scenarios discussed in [10] (Urban Microcellular (UMi) Open-Square, UMi Street-Canyon, Indoor Hotspot (InH) Office, and InH Shopping Mall) are reported in Table 1.
The complex gain is $\alpha_{i,l} \sim \mathcal{CN}(0, \sigma_\alpha^2)$, with $\sigma_\alpha^2 = 1$ [15]. The factors $\mathbf{a}_r(\phi_{i,l}^r)$ and $\mathbf{a}_t(\phi_{i,l}^t)$ represent the normalized receive and transmit array response vectors evaluated at the corresponding angles of arrival and departure; for a uniform linear array (ULA) with half-wavelength inter-element spacing we have $\mathbf{a}_t(\phi) = \frac{1}{\sqrt{N_T}} \left[1, e^{-j\pi \sin\phi}, \ldots, e^{-j\pi (N_T - 1) \sin\phi}\right]^T$. A similar expression can also be given for $\mathbf{a}_r(\phi)$. Finally,
$\gamma = \sqrt{N_R N_T / \sum_{i=1}^{N_{cl}} N_{ray,i}}$ is a normalization factor that ensures that the received signal power scales linearly with the product $N_R N_T$. Regarding the LOS component, the arrival and departure angles corresponding to the LOS link are denoted by $\phi_{LOS}^r$ and $\phi_{LOS}^t$, and we assume that
$$\mathbf{H}_{LOS}(\phi_{LOS}^r, \phi_{LOS}^t) = I_{LOS}(d) \sqrt{N_R N_T}\, e^{j\theta} \sqrt{L(d)}\, \mathbf{a}_r(\phi_{LOS}^r)\, \mathbf{a}_t^H(\phi_{LOS}^t). \tag{7}$$
In the above equation, $\theta \sim \mathcal{U}(0, 2\pi)$, and $I_{LOS}(d)$ is a random variate indicating the existence of a LOS link between transmitter and receiver. A detailed description of all the parameters needed for the generation of sample realizations of the channel model in (5) is reported in [9]. Comparing the channel model in (5) for mm-Wave frequencies with the one in (1) for μ-Wave frequencies, it is immediately evident that the channel in (5) is a parametric channel model whose rank is tied to the number of clusters and reflectors contributing to the transmitter-receiver link. The next section describes the implications that these two radically different channel models have on the architecture and on the attainable performance of massive MIMO multiuser wireless systems operating at μ-Wave and at mm-Wave frequencies.
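To make the structure of the clustered model more concrete, the following sketch draws the non-LOS part of such a channel with NumPy. It is a simplified illustration and not the generation procedure of [9]: it assumes the same number of rays in every cluster, an identical attenuation on every path, cluster centres uniform in $(-\pi/2, \pi/2)$, and Gaussian ray offsets of standard deviation `spread`; the function and parameter names are hypothetical.

```python
import numpy as np


def ula_response(n, phi):
    """Unit-norm response of an n-element ULA with half-wavelength spacing."""
    return np.exp(-1j * np.pi * np.arange(n) * np.sin(phi)) / np.sqrt(n)


def clustered_channel(n_r, n_t, n_cl, n_ray, rng, spread=0.1, gain_db=0.0):
    """Draw the non-LOS part of a clustered channel (simplified sketch).

    Assumptions: n_ray rays in every cluster, the same attenuation gain_db
    on every path, uniform cluster centres, Gaussian ray offsets (radians).
    """
    gamma = np.sqrt(n_r * n_t / (n_cl * n_ray))   # power normalization
    amp = 10.0 ** (gain_db / 20.0)
    h = np.zeros((n_r, n_t), dtype=complex)
    for _ in range(n_cl):
        centre_r, centre_t = rng.uniform(-np.pi / 2, np.pi / 2, size=2)
        for _ in range(n_ray):
            # Complex path gain alpha ~ CN(0, 1).
            alpha = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
            a_r = ula_response(n_r, centre_r + spread * rng.standard_normal())
            a_t = ula_response(n_t, centre_t + spread * rng.standard_normal())
            h += alpha * amp * np.outer(a_r, a_t.conj())   # rank-1 term a_r a_t^H
    return gamma * h


rng = np.random.default_rng(0)
H = clustered_channel(n_r=16, n_t=32, n_cl=2, n_ray=3, rng=rng)
print(H.shape, np.linalg.matrix_rank(H))
```

Since the matrix is a sum of six rank-1 terms, its rank cannot exceed six whatever the array sizes, while the normalization factor keeps the average squared Frobenius norm close to the product of the two antenna numbers.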
3 mm-Wave vs. μ-Wave Massive MIMO
In the following, we highlight and discuss six key differences between μ-Wave and mm-Wave massive MIMO systems.
3.1 Doubly Massive MIMO at mm-Waves
The idea of a large-scale antenna array was originally launched by Marzetta in his pioneering paper [12] with reference to BSs. The paper showed that, in the limit of a large number of base station antennas, small-scale fading effects vanish by virtue of channel hardening, and that channel vectors from the BS to the users tend to become orthogonal; consequently, plain channel-matched beamforming at the BS permits serving several users on the same time-frequency resource slot with (ideally) no interference, and the only remaining impairment is imperfect channel estimation, due to the fact that orthogonal pilots are limited and must be re-used throughout the network (this is the so-called pilot contamination effect, discussed in the following). Reference [12] considered a system where mobile users were equipped with just one antenna. Subsequent studies have extended the massive MIMO idea at μ-Wave frequencies to the case in which the mobile devices have multiple antennas, but this number is obviously limited to a few units. Indeed, at μ-Wave frequencies the wavelength is in the order of several centimeters, and it is thus difficult to pack many antennas on small-sized user devices. At μ-Waves, thus, massive MIMO just refers to BSs. Things are instead different at mm-Waves, wherein multiple antennas are necessary first and foremost to compensate for the increased path loss with respect to conventional sub-6 GHz frequencies. At mm-Waves, the wavelength is in the order of millimeters, and, at least in principle, a large number of antennas can be mounted not only on the BS, but also on the user device. As an example, at a carrier frequency of 30 GHz the wavelength is 1 cm, and for a planar antenna array with λ/2 spacing, more than 180 antennas can be placed in an area as large as a standard credit card (8.5 cm x 5.5 cm); this number climbs up to 1300 at a carrier frequency of 80 GHz.
This consideration leads to the concept of doubly massive MIMO system [7], which is defined as a wireless communication system where the number of antennas grows large at both the transmitter and the receiver. Of course, there are a number of serious practical constraints (e.g., large power consumption, low efficiency of power amplifiers, hardware complexity, ADC and beamformer implementation) that currently prevent the feasibility of a user terminal equipped with hundreds of antennas. Mobile devices with a massive number of antennas will thus not be available in the next few years, but, given the intense pace of technological progress, sooner or later they will become reality. As far as long-term forward-looking theoretical research is concerned, we believe that doubly massive MIMO systems at mm-Waves will be a popular research topic for years to come.
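The credit-card example given earlier in this subsection can be checked with a few lines of arithmetic. The sketch below assumes that each antenna element occupies a square cell of side λ/2 on the card; the function name and this packing rule are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def antennas_on_card(f_hz, width_m=0.085, height_m=0.055):
    """Number of lambda/2-spaced elements fitting in a width x height area."""
    spacing = SPEED_OF_LIGHT / f_hz / 2.0          # half wavelength in metres
    return math.floor(width_m / spacing) * math.floor(height_m / spacing)


for f_ghz in (30, 80):
    print(f_ghz, "GHz:", antennas_on_card(f_ghz * 1e9))
```

Under this packing rule the count is 17 x 11 elements at 30 GHz and 45 x 29 elements at 80 GHz, in line with the figures of "more than 180" and roughly 1300 quoted in the text.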
3.2 Analog (Beam-Steering) Beamforming Optimal
One problem with massive MIMO systems is the cost and the complexity of the hardware needed to efficiently exploit such a large number of antennas. If fully digital beamforming is to be used, as many RF chains are needed as the number of antennas; consequently, energy consumption also grows linearly with the number of antennas. In order to circumvent this problem, lower complexity architectures have been proposed, encompassing, for instance, 1-bit quantization of the antenna outputs [19] and hybrid analog/digital beamforming structures [11], [18], [20], wherein an RF beamforming matrix (whose entries operate as simple phase shifters) is cascaded to a reduced-size digital beamformer. The authors of [21] have shown that if the number of RF chains is twice the multiplexing order, the hybrid beamformer is capable of implementing any fully digital beamformer. Now, while at μ-Waves the use of a hybrid beamformer brings an unavoidable performance degradation, at mm-Waves something different happens in the limiting regime of a large number of antennas, by virtue of the different propagation mechanisms. Indeed, the channel matrix in (5) can be compactly re-written as:
$$\mathbf{H} = \gamma \sum_{i=1}^{N} \alpha_i\, \mathbf{a}_r(\phi_i^r)\, \mathbf{a}_t^H(\phi_i^t), \tag{8}$$
where we lump the path-loss terms into the coefficients $\alpha_i$, and group the two summations over the clusters and the rays into just one summation, with $N$ being the number of propagation paths from the transmitter to the receiver. Given the continuous random location of the scatterers, the arrival angles will all be different with probability 1, i.e., there is zero probability that two distinct scatterers contribute to the channel with the same departure and arrival angles. Since, for a large number of antennas, we have $\mathbf{a}_x^H(\phi_p)\, \mathbf{a}_x(\phi_q) \to 0$, with $x \in \{r, t\}$, provided that $\phi_p \neq \phi_q$, we can conclude that, for large $N_R$, the vectors $\mathbf{a}_r(\phi_i^r)$ for all $i$ converge to an orthogonal set, and, similarly, for large $N_T$, the vectors $\mathbf{a}_t(\phi_i^t)$ for all $i$ converge to an orthogonal set as well. Accordingly, in the doubly massive MIMO regime, the array response vectors $\mathbf{a}_r(\phi_i^r)$ and $\mathbf{a}_t(\phi_i^t)$ become the left and right singular vectors of the channel matrix, i.e., the channel representation (8) coincides with the singular value decomposition of the channel matrix. In this situation, purely analog (beam-steering) beamforming becomes optimal. Otherwise stated, we have two main consequences. First, in a single-user link, the channel eigendirections associated to the largest eigenvalues are just the beam-steering vectors corresponding to the arrival and departure angles $\phi_i^r$ and $\phi_i^t$ associated with the predominant scatterers. This suggests that pre-coding and post-coding beamforming simply require pointing a beam towards the predominant scatterer at the transmitter and at the receiver, respectively. Second, in a multiuser environment, assuming that the links between the several users and the BS involve separate scatterers and different sets of arrival and departure angles, beam-steering analog beamforming automatically results in no co-channel interference (in the limiting regime of an infinite number of antennas), since the beams pointed towards different users tend to become orthogonal. Fig. 2 provides some experimental evidence of the above statements.
We have considered a single-user MIMO link at mm-Waves; the carrier frequency is 73 GHz, the transmitting antenna height is 15 m, while the receiving antenna height is 1.65 m. All the parameters needed for the generation of the mm-Wave channel matrix in (5) are the ones reported in [9] for the "open square model". Fig. 2 shows the system spectral efficiency, measured in bit/s/Hz, versus the received signal-to-noise ratio (SNR), and it compares the performance of channel-matched (CM) fully digital beamforming and of analog (AN) beam-steering beamforming. With CM beamforming, the pre-coding and post-coding beamformers are the right and left singular vectors of the channel matrix in (5) associated to the largest singular values, respectively; with AN beamforming, instead, the pre-coding and post-coding beamformers are simply the array responses corresponding to the departure and arrival angles associated to the dominant scatterers, respectively. From the figure it is seen that AN beamforming achieves practically the same performance as CM beamforming for one of the considered multiplexing orders $M$, even in the case of a not-so-large number of antennas, while there is a small gap for the other; this gap is expected to shrink as the number of antennas increases.
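The argument of this subsection can be illustrated numerically for a single data stream. The sketch below builds a channel with the structure of a sum of rank-1 array-response terms, using three hypothetical paths with fixed, well separated angles, and compares channel-matched beamforming (dominant singular vectors) with analog beam steering towards the strongest path. The gains, angles, SNR scaling and function names are assumptions chosen for illustration; this is not the simulation setup behind Fig. 2.

```python
import numpy as np


def ula_response(n, phi):
    """Unit-norm response of an n-element ULA with half-wavelength spacing."""
    return np.exp(-1j * np.pi * np.arange(n) * np.sin(phi)) / np.sqrt(n)


def path_channel(n_r, n_t, gains, phi_r, phi_t):
    """H = gamma * sum_i gains[i] * a_r(phi_r[i]) a_t(phi_t[i])^H."""
    gamma = np.sqrt(n_r * n_t / len(gains))
    h = sum(g * np.outer(ula_response(n_r, pr), ula_response(n_t, pt).conj())
            for g, pr, pt in zip(gains, phi_r, phi_t))
    return gamma * h


def single_stream_se(h, d, q, snr):
    """log2(1 + snr * |d^H H q|^2) for unit-norm combiner d and precoder q."""
    return float(np.log2(1.0 + snr * abs(np.vdot(d, h @ q)) ** 2))


def cm_and_an_se(n, gains, phi_r, phi_t, snr):
    """Spectral efficiency of CM and AN beamforming for one stream (M = 1)."""
    h = path_channel(n, n, gains, phi_r, phi_t)
    u, _, vh = np.linalg.svd(h)
    se_cm = single_stream_se(h, u[:, 0], vh[0].conj(), snr)
    best = int(np.argmax(np.abs(gains)))           # dominant scatterer
    se_an = single_stream_se(h, ula_response(n, phi_r[best]),
                             ula_response(n, phi_t[best]), snr)
    return se_cm, se_an


# Hypothetical three-path link with well separated angles (radians).
GAINS = np.array([1.0, 0.6j, -0.3])
PHI_R = [-0.8, 0.1, 0.9]
PHI_T = [0.5, -0.4, 1.0]
for n in (4, 16, 64):
    cross = abs(np.vdot(ula_response(n, PHI_R[0]), ula_response(n, PHI_R[1])))
    print(n, round(cross, 4), cm_and_an_se(n, GAINS, PHI_R, PHI_T, snr=0.01))
```

The printed cross-correlation between two array responses shrinks as the array grows, and the gap between the two spectral efficiencies shrinks with it; channel-matched beamforming can never do worse, since the dominant singular value is the largest gain achievable with unit-norm beamformers.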
3.3 Rank of the Channel Not Increasing with $N_T$ and $N_R$
At μ-Wave frequencies, the i.i.d. assumption for the small-scale fading component of the channel matrix guarantees that, with probability 1, the matrix has rank equal to $\min(N_R, N_T)$. Consequently, as long as the rich-scattering environment assumption holds and the number of degrees of freedom of the radiated and scattered fields is sufficiently high [22], the matrix rank increases linearly with the number of antennas. At mm-Wave frequencies, instead, the validity of the channel model in (5) directly implies that, including the LOS component, the channel has at most rank $N + 1$, with $N = \sum_{i=1}^{N_{cl}} N_{ray,i}$ the number of non-LOS propagation paths, since it is expressed as the sum of $N + 1$ rank-1 matrices. This bound is clearly independent of the number of transmit and receive antennas, so, mathematically, as long as $\min(N_R, N_T)$ exceeds it, increasing the number of antennas has no effect on the channel rank. However, it has also been suggested that, for an increasing number of antennas, the directive beams become narrower and narrower and more scatterers can be resolved, which implies that the channel rank increases (even though probably not linearly) with the number of antennas. This is still a conjecture that would need experimental validation.
This different behavior of the channel rank with respect to the number of antennas has a profound impact on the multiplexing capabilities of the channel. Indeed, for μ-Wave systems, the increase in the channel rank leads to an increase in the multiplexing capabilities of the channel; in mm-Wave systems, on the other hand, the multiplexing capabilities depend on the number of scatterers in the propagation environment, while the number of antennas just contributes to the received power, which can increase proportionally to the product $N_R N_T$. Fig. 3 provides experimental evidence of such a different behavior. The figure shows the system spectral efficiency for mm-Wave and μ-Wave wireless MIMO links, for two different values of the number of receive and transmit antennas, and for three different values of the multiplexing order $M$. The parameters of the mm-Wave channel are the same as those in Fig. 2. Regarding the μ-Wave channel, a carrier frequency equal to 1.9 GHz is considered and the standard deviation of the shadow fading is taken equal to 8 dB, while the three-slope path loss model in (3) is used with breakpoint distances $d_0$ and $d_1$ expressed in meters. It is clearly seen from Fig. 3 that the μ-Wave channel has larger multiplexing capabilities than the mm-Wave channel; the gap between the two scenarios is most pronounced for large values of $M$, and it also depends on the antenna configuration.
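The different rank behavior discussed in this subsection is easy to reproduce numerically. The sketch below compares the numerical rank of a square matrix with i.i.d. complex Gaussian entries with that of a sum of a fixed number of rank-1 array-response terms, as the number of antennas grows. The uniform angle distribution, the absence of a LOS term and of path-dependent attenuation, and all function names are simplifying assumptions.

```python
import numpy as np


def iid_channel(n, rng):
    """n x n matrix with i.i.d. CN(0, 1) entries (rich-scattering model)."""
    return (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)


def sparse_path_channel(n, n_paths, rng):
    """n x n sum of n_paths rank-1 terms a_r a_t^H with random angles and gains."""
    k = np.arange(n)[:, None]
    phi_r = rng.uniform(-np.pi / 2, np.pi / 2, n_paths)
    phi_t = rng.uniform(-np.pi / 2, np.pi / 2, n_paths)
    a_r = np.exp(-1j * np.pi * k * np.sin(phi_r)) / np.sqrt(n)   # n x n_paths
    a_t = np.exp(-1j * np.pi * k * np.sin(phi_t)) / np.sqrt(n)   # n x n_paths
    alpha = (rng.standard_normal(n_paths) + 1j * rng.standard_normal(n_paths)) / np.sqrt(2)
    # gamma = sqrt(n * n / n_paths); columns of a_r are scaled by the path gains.
    return n / np.sqrt(n_paths) * (a_r * alpha) @ a_t.conj().T


def rank_table(sizes, n_paths, seed=0):
    """Return (n, rank of i.i.d. channel, rank of sparse channel) for each n."""
    rng = np.random.default_rng(seed)
    return [(n,
             int(np.linalg.matrix_rank(iid_channel(n, rng))),
             int(np.linalg.matrix_rank(sparse_path_channel(n, n_paths, rng), tol=1e-8)))
            for n in sizes]


for row in rank_table(sizes=(8, 16, 32, 64), n_paths=5):
    print(row)
```

In this sketch the rank of the i.i.d. matrix equals the number of antennas, whereas the rank of the path-based matrix never exceeds the number of paths, however large the arrays become.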
3.4 Channel Estimation Simpler
In μ-Wave massive MIMO systems, channel estimation is a rather difficult and resource-consuming task, since it requires the separate estimation of each entry of the matrix $\mathbf{H}$. It thus follows that, in a multiuser system with $K$ users equipped with $N_R$ antennas each, the number of parameters to be estimated is $K N_R N_T$, where $N_T$ denotes the number of antennas at the BS. The attendant computational complexity needed to perform channel estimation is a growing function of the number of antennas used. Additionally, the increase in the number of antennas at the mobile devices has a direct impact on the network capacity. Indeed, let $T_c$ denote the duration (in discrete samples) of the channel coherence time and $\tau_p$ the length (again in discrete samples) of the pilot sequences used on the uplink for channel estimation; since the length of the pilot sequences must be a fraction (typically no more than 1/2) of the channel coherence length, and since the use of orthogonal pilots across users requires that $\tau_p \ge K N_R$, it is readily seen that we have a physical bound on the maximum number of users and on the number of transceiver antennas at the mobile device. Such a bound is the main underlying motivation for the fact that a considerable share of the available literature on massive MIMO systems at μ-Waves focuses on the case of single-antenna mobile devices: with $N_R = 1$, the number of users can be taken larger. Additionally, to increase the number of supported users, pseudo-orthogonal pilots with low cross-correlation are used, even though this leads to the well-known pilot contamination problem that, as discussed in the sequel, is the ultimate performance limit in μ-Wave massive MIMO systems [12].
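The pilot-length constraint above translates directly into a bound on the number of users that can be given orthogonal pilots. A minimal sketch of this arithmetic follows; the function name, the default fraction of 1/2 and the coherence length of 200 samples are illustrative assumptions.

```python
def max_orthogonal_users(coherence_len, antennas_per_user, pilot_fraction=0.5):
    """Largest K such that K * antennas_per_user <= pilot_fraction * coherence_len."""
    max_pilot_len = int(pilot_fraction * coherence_len)   # samples left for pilots
    return max_pilot_len // antennas_per_user


# Hypothetical coherence length of 200 samples.
for n_ant in (1, 2, 4, 8):
    print(n_ant, "antennas per user ->", max_orthogonal_users(200, n_ant), "users")
```

With these numbers, single-antenna devices allow up to 100 users with orthogonal pilots, while four-antenna devices reduce the figure to 25, which is the trade-off described in the paragraph above.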
At mm-Wave frequencies, instead, the clustered channel model of (5) is basically a parametric model, and the number of parameters is essentially independent of the number of antennas. Based on this consideration, the computational complexity of the channel estimation schemes at mm-Waves may be smaller than that at μ-Waves. Channel estimation for mm-Wave frequencies is a research track that is currently under development, whereas for μ-Waves this is a rather mature area. Among the several existing approaches to channel estimation at mm-Waves, the most widely considered ones rely either on compressed sensing or on subspace methods. As an example, reference [23] shows that at mm-Waves, for an increasing number of antennas, the most significant components of the received signal lie in a low-dimensional subspace, due to the limited angular spread of the reflecting clusters. This low-dimensionality feature can be exploited in order to obtain channel estimation algorithms based on the sampling of only a small subset, rather than the whole set, of antenna elements. Consequently, channel estimation can be performed using a reduced number (with respect to the number of receive antennas) of RF chains and A/D converters at the receiver front-end. Reference [24], instead, develops subspace-based channel estimation methods exploiting channel reciprocity in TDD systems, using the well-known Arnoldi iteration and explicitly taking into account the adoption of hybrid analog/digital beamforming structures at the transmitter and at the receiver. Subspace methods are particularly attractive in those situations where it is of interest to estimate the principal left and right singular vectors of the channel matrix $\mathbf{H}$, which, in the doubly massive MIMO regime, are well approximated by the array response vectors corresponding to the dominant scatterers.
As done in [25], by applying fast subspace estimation algorithms such as Oja's [26], the dominant channel eigenvectors can be directly obtained from the sample estimate of the data covariance matrix, with no need to directly estimate the whole channel matrix $\mathbf{H}$.
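As an illustration of the subspace idea, the sketch below applies the basic single-vector Oja rule with per-step normalization to synthetic received vectors whose covariance has one dominant direction. It is not the orthogonal Oja algorithm of [25]; the signal model, step size, noise level and function names are assumptions chosen to keep the example short.

```python
import numpy as np


def oja_principal_direction(samples, step=0.01, seed=0):
    """Track the dominant eigenvector of E[y y^H] with Oja's rule.

    samples: array of shape (num_samples, n), one received vector per row.
    """
    rng = np.random.default_rng(seed)
    n = samples.shape[1]
    w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    w /= np.linalg.norm(w)
    for y in samples:
        w = w + step * y * np.vdot(y, w)   # w <- w + step * y (y^H w)
        w /= np.linalg.norm(w)             # keep the estimate at unit norm
    return w


def simulate_samples(direction, power, noise_var, num, rng):
    """Rows y_t = sqrt(power) * s_t * direction + noise, with s_t ~ CN(0, 1)."""
    n = direction.size
    s = (rng.standard_normal(num) + 1j * rng.standard_normal(num)) / np.sqrt(2)
    noise = np.sqrt(noise_var / 2) * (rng.standard_normal((num, n))
                                      + 1j * rng.standard_normal((num, n)))
    return np.sqrt(power) * s[:, None] * direction[None, :] + noise


# Hypothetical setup: 16-element ULA, dominant arrival angle of 0.4 rad.
N_ANT = 16
A_DOM = np.exp(-1j * np.pi * np.arange(N_ANT) * np.sin(0.4)) / np.sqrt(N_ANT)
Y = simulate_samples(A_DOM, power=10.0, noise_var=0.1, num=3000,
                     rng=np.random.default_rng(1))
W_HAT = oja_principal_direction(Y)
print(abs(np.vdot(A_DOM, W_HAT)))   # alignment with the true direction
```

The estimate is updated one received vector at a time and never forms or decomposes the full covariance matrix, which is what makes this family of algorithms attractive when the number of antennas is large.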
Figs. 4 and 5 show numerical results concerning channel estimation at μ-Wave and at mm-Wave carrier frequencies. In particular, both figures report the spectral efficiency vs. the received SNR for two different antenna configurations, contrasting the case of perfect channel state information (CSI) with the case in which the channel is estimated based on training pilots. In both figures a single-user MIMO link is considered, and channel estimation is carried out assuming that each transmit antenna sends an orthogonal pilot. The number of signaling intervals devoted to channel estimation coincides with the number of transmit antennas. Note that this is the minimum possible duration in order to be able to send orthogonal pilots. Channel estimation at μ-Wave frequencies (Fig. 4) is made using the linear minimum mean square error criterion [27], while at mm-Wave frequencies (Fig. 5) the approximate maximum likelihood (AML) algorithm of [23] and the orthogonal Oja (OOJA) algorithm [25] are used. Comparing the figures, it is clearly seen that the gap between the case of estimated channel and the case of perfect CSI is smaller at mm-Wave frequencies, especially when the OOJA algorithm is considered. Conversely, this gap is larger at μ-Waves, and it grows with the dimension of the user antenna arrays. This behavior can be intuitively explained by virtue of the parametric form of the mm-Wave channel model in (5), which permits the development of efficient channel estimation algorithms.
3.5 Pilot Contamination Less Critical
Pilot contamination is the ultimate disturbance in massive MIMO systems operating at μ-Waves. As already discussed in the previous paragraphs, the impossibility of having a number of orthogonal pilots larger than the number of signaling intervals devoted to channel estimation leads to the use of pseudo-orthogonal, low cross-correlation sequences. Accordingly, in a massive MIMO system, when the mobile stations (MSs) transmit their own pilot sequences in the uplink training phase to enable channel estimation at the BSs, every BS learns the channel from the intended MS, and also small pieces of the channels from the other MSs using pilots that are correlated to the one used by the intended MS. This phenomenon, in turn, causes a saturation in the achieved Signal-to-Interference-plus-Noise Ratio (SINR) both in the downlink and in the uplink. The deceitful nature of pilot contamination was unveiled by Marzetta in his landmark paper [12], and since then many authors have investigated its effects in depth and proposed strategies to counteract them [28], [29], [30]. All of these papers deal with the case of a μ-Wave massive MIMO system.
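The mechanism described above can be seen in a toy uplink training example. The sketch below uses a plain least-squares (pilot-correlation) estimator and the extreme case in which a second user re-uses exactly the same pilot; the estimator, the pilot sequences, the array size and the function names are illustrative assumptions and do not reproduce any specific scheme from the cited references.

```python
import numpy as np


def ls_estimate(received, pilot):
    """Least-squares estimate of the channel of the user sending `pilot`."""
    return received @ pilot.conj() / np.vdot(pilot, pilot).real


def uplink_training(channels, pilots, noise_std, rng):
    """Received block Y = sum_k h_k pilot_k^T + noise (antennas x pilot length)."""
    n_ant, tau = channels[0].size, pilots[0].size
    noise = noise_std * (rng.standard_normal((n_ant, tau))
                         + 1j * rng.standard_normal((n_ant, tau))) / np.sqrt(2)
    return sum(np.outer(h, p) for h, p in zip(channels, pilots)) + noise


rng = np.random.default_rng(0)
h1 = (rng.standard_normal(8) + 1j * rng.standard_normal(8)) / np.sqrt(2)
h2 = (rng.standard_normal(8) + 1j * rng.standard_normal(8)) / np.sqrt(2)
p_a = np.array([1.0, 1.0, 1.0, 1.0], dtype=complex)
p_b = np.array([1.0, -1.0, 1.0, -1.0], dtype=complex)   # orthogonal to p_a

y_orth = uplink_training([h1, h2], [p_a, p_b], noise_std=0.0, rng=rng)
y_reuse = uplink_training([h1, h2], [p_a, p_a], noise_std=0.0, rng=rng)
print(np.linalg.norm(ls_estimate(y_orth, p_a) - h1))    # clean estimate
print(np.linalg.norm(ls_estimate(y_reuse, p_a) - h1))   # error equals |h2|
```

With orthogonal pilots the interfering user is removed by the correlation, while with a re-used pilot the estimate contains the sum of the two channels, and no amount of extra transmit power or antennas removes this error term.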
Pilot contamination at mm-Wave frequencies is instead a much less studied topic (some initial results are reported in [31]). This is in part due to the fact that massive MIMO at mm-Waves is a more recent research topic than massive MIMO at μ-Waves. On the other hand, it may be envisioned that pilot contamination is less critical at mm-Waves than it has proved to be at μ-Waves, mainly because of the short-range nature of mm-Wave links. In particular, while the range of μ-Wave links can be in the order of thousands of meters, the range of mm-Wave links will be more than one order of magnitude smaller, due to the increased path loss and to the larger relevance of signal blockages. mm-Wave frequencies will be used for short-range communications in small cells, which, by nature, usually serve a smaller number of users than conventional micro-cells and macro-cells. Therefore, on one hand, the signals transmitted by the MSs during uplink training fade rapidly with the distance, and thus they should not be a serious impairment to surrounding BSs learning the channel from their intended MSs; on the other hand, the reduced number of users in each cell will lead to a less severe shortage of orthogonal pilots. The results in [31] seem to confirm such increased resilience of mm-Waves to the pilot contamination problem.
3.6 Antenna Diversity/Selection Procedures Less Effective
The i.i.d. nature of the fast fading component in the MIMO channel matrix at μ-Waves in (1) leads to a monotonic increase, with the number of antennas, of the diversity order that can be attained. In particular, an $N_R \times N_T$ channel brings a diversity order equal to $N_R N_T$, thus implying that the average error probability decreases to zero, in the limit of large Signal-to-Noise Ratio (SNR), as $\mathrm{SNR}^{-N_R N_T}$. Such a diversity order can be attained through a simple antenna selection procedure, by picking the transmit and receive antennas corresponding to the entry with the largest magnitude in the channel matrix $\mathbf{H}$. Looking at this fact from a different perspective, we can recall the well-known probability result stating that the maximum of a set of positive i.i.d. random variables taking values in the interval $[0, +\infty)$ becomes unbounded as the cardinality of the set diverges. As a consequence, for an increasing number of antennas, the probability of observing a very large entry in the channel matrix rapidly increases. The open literature is rich in studies exploiting this peculiarity of μ-Wave MIMO channels and proposing diversity techniques based on antenna selection procedures (e.g., [32] and [33]).
At mm-Waves, instead, given the parametric channel model of (5), a different behavior is observed. In particular, the entries of the channel matrix no longer have an i.i.d. component, and this implies that the maximum of the magnitudes of the entries of $\mathbf{H}$ grows at a much reduced pace. As a consequence, diversity techniques using antenna selection procedures are less effective.
As experimental evidence of this fact, Fig. 6 compares the parameter $\rho$ in (9), for different values of $N_R \times N_T$, and for both the μ-Wave and mm-Wave channel models.
$$\rho = \frac{\max_{i,j} |\mathbf{H}(i,j)|^2}{\frac{1}{N_R N_T} \sum_{i=1}^{N_R} \sum_{j=1}^{N_T} |\mathbf{H}(i,j)|^2}. \tag{9}$$
The quantity $\rho$ is the ratio between the largest squared magnitude among the entries of $\mathbf{H}$ and the average squared magnitude. The larger $\rho$ is, the more unbalanced the magnitudes of the entries of the channel matrix are, since $\rho$ basically measures how far the largest entry in $\mathbf{H}$ is from the average magnitude. Fig. 6 shows that the parameter $\rho$ is in general an increasing function of the number of antenna elements, but it grows much more rapidly in the case of μ-Wave channels.
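The peak-to-average ratio of the squared entry magnitudes described above is straightforward to compute. The sketch below (where the ratio is called `rho`, a name chosen here) evaluates it for an i.i.d. complex Gaussian matrix and for a hypothetical three-path channel with fixed, well separated angles; the path gains, angles and function names are assumptions, and the numbers are not those of Fig. 6.

```python
import numpy as np


def peak_to_average(h):
    """Largest squared entry magnitude of h divided by the mean squared magnitude."""
    power = np.abs(h) ** 2
    return float(power.max() / power.mean())


def iid_channel(n, rng):
    """n x n matrix with i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)


def path_channel(n, gains, phi_r, phi_t):
    """n x n sum of rank-1 terms gains[i] * a_r(phi_r[i]) a_t(phi_t[i])^H (ULAs)."""
    k = np.arange(n)[:, None]
    a_r = np.exp(-1j * np.pi * k * np.sin(np.asarray(phi_r))) / np.sqrt(n)
    a_t = np.exp(-1j * np.pi * k * np.sin(np.asarray(phi_t))) / np.sqrt(n)
    return n / np.sqrt(len(gains)) * (a_r * np.asarray(gains)) @ a_t.conj().T


# Hypothetical three-path link (angles in radians).
GAINS = [1.0, 0.6j, -0.3]
PHI_R = [-0.8, 0.1, 0.9]
PHI_T = [0.5, -0.4, 1.0]
rng = np.random.default_rng(0)
for n in (8, 16, 32, 64):
    print(n, round(peak_to_average(iid_channel(n, rng)), 2),
          round(peak_to_average(path_channel(n, GAINS, PHI_R, PHI_T)), 2))
```

For the path-based matrix every entry is a sum of only three bounded terms, so the ratio stays small, whereas for the i.i.d. matrix the largest of a growing number of independent entries keeps increasing with the array size.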
4 Conclusions
This paper outlined a critical comparison between massive MIMO systems at mm-Waves and at μ-Waves. Six key differences were outlined, and their implications on the transceiver architecture and on the attainable performance were discussed and also validated through the results of computer simulations. Among the discussed differences, we believe that the most disruptive one is the first, i.e., the fact that MIMO systems may be doubly massive at mm-Waves. Indeed, while it has been shown that at mm-Waves the use of large-scale antenna arrays does not have as beneficial an impact on the system multiplexing capabilities as it has at μ-Wave frequencies, the availability of doubly massive MIMO wireless links will enable the generation of very narrow beams, resulting in reduced co-channel interference to other users using the same time-frequency resources. Another key advantage of doubly massive MIMO systems at mm-Waves is the fact that the computational complexity of channel estimation depends only weakly on the number of antennas, especially for the case in which analog (beam-steering) beamforming strategies are used. While massive MIMO at μ-Wave frequencies is gradually entering 3GPP standards, mm-Waves, and in particular massive mm-Wave MIMO systems, are still under heavy investigation, both in academia and in industry. It is however anticipated that sooner or later a technology readiness level will be reached such that they will be included in 3GPP standards. The authors hope that this article will help to move us forward along this road.