The methodology of the P3 connect Mobile Benchmark is the result of more than 15 years of testing mobile networks. Today, network tests are conducted in more than 80 countries. Our methodology was carefully designed to evaluate and objectively compare the performance and service quality of mobile networks from the users’ perspective.
DRIVETESTS AND WALKTESTS
The drivetests and walktests in Spain took place throughout October 2018. All samples were collected during the day, between 8.00 a.m. and 10.00 p.m. The network tests covered inner-city areas as well as outer metropolitan and suburban areas. Measurements were also taken in smaller towns and cities along connecting highways. The connection routes between the cities alone covered about 2,100 kilometres per car – 8,400 kilometres for all four cars. In total, the four vehicles together covered about 12,400 kilometres.
The combination of test areas has been selected to provide representative test results across the Spanish population. The areas selected for the 2018 test account for more than 12 million people, or roughly 25.8 per cent of the total population of Spain. The test routes and all visited cities and towns are shown on page 1 of this report. The four drive-test cars were equipped with arrays of Samsung Galaxy S8 smartphones for the simultaneous measurement of voice and data services.
One smartphone per operator in each car was used for the voice tests, setting up test calls from one car to another. The walktest team also carried one smartphone per operator for the voice tests. In this case, the smartphones called a stationary counterpart. The audio quality of the transmitted speech samples was evaluated using the HD-voice capable, ITU-standardised POLQA wideband algorithm.
All smartphones used for the voice tests were set to VoLTE preferred mode. In networks or areas where this modern 4G-based voice technology was not available, they would perform a fallback to 3G or 2G.
As a new KPI in 2018, we assess the so-called P90 value for call setup times. The P90 value specifies the threshold in a statistical distribution below which 90 per cent of the gathered values lie.
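The P90 statistic can be sketched as follows. The nearest-rank percentile method and the sample call setup times are illustrative assumptions – P3 does not publish its exact percentile calculation or raw data.

```python
def p90(values):
    """Return the threshold below which 90 per cent of the gathered
    values lie (nearest-rank method, an assumed implementation)."""
    ordered = sorted(values)
    rank = max(0, int(0.9 * len(ordered)) - 1)
    return ordered[rank]

# Illustrative call setup times in seconds (not actual measurement data).
setup_times = [1.1, 0.9, 1.4, 1.0, 1.2, 2.8, 1.3, 1.0, 1.1, 0.95]
print(p90(setup_times))  # → 1.4
```

Note how the single outlier of 2.8 seconds barely influences the P90 value, whereas it would noticeably distort a plain average – this robustness is why a percentile threshold is useful for call setup times.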
In order to account for typical smartphone-use scenarios during the voice tests, background data traffic was generated in a controlled way through random injection of small amounts of HTTP traffic. The voice scores account for 34 per cent of the total results.
Data performance was measured using four more Galaxy S8 smartphones in each car – one per operator. Their radio access technology was set to LTE preferred mode.
For the web tests, these smartphones accessed web pages according to the widely recognised Alexa ranking. In addition, the static “Kepler” test web page as specified by ETSI (European Telecommunications Standards Institute) was used. In order to test the data service performance, files of 3 MB and 1 MB for download and upload were transferred from or to a test server located on the Internet. In addition, the peak data performance was tested in uplink and downlink directions by assessing the amount of data that was transferred within a seven-second time period.
The evaluation of YouTube playback takes into account that YouTube dynamically adapts the video resolution to the available bandwidth. So, in addition to success ratios, start times and playouts without interruptions, the measurements also determined average video resolution.
All the tests were conducted with the best-performing mobile plan available from each operator. Data scores account for 51 per cent of the total results.
Additionally, P3 conducted crowd-based analyses of the Spanish networks which contribute 15 per cent to the end result. They are based on data gathered in July, August and September 2018.
For the collection of crowd data, P3 has integrated a background diagnosis process into more than 800 diverse Android apps. If one of these applications is installed on the end-user’s phone and the user authorises the background analysis, data collection takes place 24/7, 365 days a year. Reports are generated for every hour and sent daily to P3‘s cloud servers.
Each report comprises just a small number of bytes and does not include any personal user data. Interested parties can deliberately take part in the data gathering with the dedicated ”U get“ app (see below). This unique crowdsourcing technology allows P3 to collect data about real-world experience wherever and whenever customers use their smartphones.
For the assessment of network coverage, P3 lays a grid of 2 by 2 kilometres over the whole test area. The “evaluation areas“ generated this way are then sub-divided into 16 smaller tiles. To ensure statistical relevance, P3 requires a certain number of users and measurement values per operator for each tile and each evaluation area. If these thresholds are not met by one of the operators, this part of the map will not be considered in the assessment for the sake of fairness.
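The grid logic described above can be illustrated as follows. A 2 by 2 kilometre evaluation area sub-divided into 16 tiles yields 4 x 4 tiles of 500 metres; the per-tile sample threshold used here is a placeholder, since P3 does not publish the exact required numbers.

```python
# A 2 x 2 km evaluation area is sub-divided into 4 x 4 tiles of 500 m each.
AREA_SIZE_M = 2000
TILE_SIZE_M = 500

def locate(x_m, y_m):
    """Map a position (metres from the grid origin) to its evaluation
    area and to the tile within that area."""
    area = (int(x_m // AREA_SIZE_M), int(y_m // AREA_SIZE_M))
    tile = (int((x_m % AREA_SIZE_M) // TILE_SIZE_M),
            int((y_m % AREA_SIZE_M) // TILE_SIZE_M))
    return area, tile

def area_is_valid(samples_per_tile, min_samples=5):
    """Check the per-tile sample threshold for one operator in one
    evaluation area. The threshold of 5 is an assumed placeholder."""
    return all(n >= min_samples for n in samples_per_tile)
```

For example, `locate(2500, 700)` falls into the second evaluation area along the x-axis and into tile (1, 1) within it. If `area_is_valid` fails for any single operator, the whole evaluation area would be excluded for all operators, mirroring the fairness rule above.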
The “Quality of Coverage“ reveals whether voice and data services actually work in the respective evaluation area. P3 assesses this because mobile services cannot necessarily be used in every area that nominally provides network reception. We specify these values for the coverage of voice services (2G, 3G and 4G combined), data services (3G and 4G combined) and 4G only.
Additionally, P3 investigates the data rates that were actually available to each user. For this purpose, we determine the best obtained data rate for each user during the evaluation period and then calculate their average value. In addition, we determine the P90 values (see previous page) for the top throughput of each evaluation area as well as of each user‘s best throughput. These values depict how fast the network is under favorable conditions.
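The aggregation described above – taking each user‘s best data rate over the evaluation period and then averaging those bests – could be sketched like this; the sample figures are illustrative, not measured values:

```python
from collections import defaultdict

def average_of_best_rates(samples):
    """samples: (user_id, throughput_mbps) pairs from the evaluation
    period. Returns the average over each user's best data rate."""
    best = defaultdict(float)
    for user, rate in samples:
        best[user] = max(best[user], rate)   # best obtained rate per user
    return sum(best.values()) / len(best)    # average of per-user bests

# Illustrative: user "a" peaks at 45 Mbps, user "b" at 30 Mbps.
rates = [("a", 20.0), ("a", 45.0), ("b", 30.0), ("b", 10.0)]
print(average_of_best_rates(rates))  # → 37.5
```

Averaging per-user maxima, rather than all raw samples, is what lets this metric depict how fast the network is under favourable conditions for each user.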
DATA SERVICE AVAILABILITY
Formerly called “operational excellence“, this parameter indicates the number of outages or “service degradations“ – events where data connectivity is impaired for a number of users that significantly exceeds the expected level. To judge this, the algorithm looks at a sliding window around the hour of interest. This ensures that we only consider actual degradations as opposed to a simple loss of network coverage due to prolonged indoor stays or similar reasons.
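One way to picture the sliding-window check is to compare the failure count in the hour of interest with the level that is typical for the surrounding hours. The window width, threshold factor and baseline floor below are illustrative assumptions, not P3‘s actual parameters.

```python
def is_degradation(failures_per_hour, hour, window=12, factor=3.0, min_baseline=1.0):
    """Flag the given hour if its failure count significantly exceeds
    the expectation derived from a sliding window around it.
    Window width and threshold factor are assumed placeholders."""
    lo = max(0, hour - window)
    hi = min(len(failures_per_hour), hour + window + 1)
    # Expectation from the neighbouring hours, excluding the hour itself.
    neighbours = failures_per_hour[lo:hour] + failures_per_hour[hour + 1:hi]
    expected = max(min_baseline, sum(neighbours) / len(neighbours))
    return failures_per_hour[hour] > factor * expected
```

Because the expectation is derived locally, an isolated spike stands out as a degradation, while a uniformly low or quiet period – such as a phone lying indoors overnight – does not trigger a false alarm.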
In order to ensure statistical relevance, valid assessment months and hours must fulfil distinct requirements. Each operator must have sufficient statistics for trend and noise analyses in each evaluated hour. The exact number depends on the market size and the number of operators. A valid assessment month must comprise at least 90 per cent valid assessment hours. Deviating from the other crowd score elements, Data Service Availability is rated based on a five-month observation period – in this case from May to September 2018.
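The 90-per-cent rule for valid assessment months can be expressed compactly; the per-hour validity check itself is abstracted away here, since its exact thresholds are market-dependent.

```python
def month_is_valid(hour_validity):
    """hour_validity: one boolean per evaluated hour of the month.
    A valid assessment month must comprise at least 90 per cent
    valid assessment hours."""
    return sum(hour_validity) / len(hour_validity) >= 0.9
```

For a 30-day month of 720 hours, at least 648 hours would have to be valid for the month to count towards the Data Service Availability score.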
PARTICIPATE IN OUR CROWDSOURCING
Anyone interested in being part of our global crowdsourcing panel and obtaining insights into the reliability of the mobile network that his or her smartphone is logged into can participate most easily by installing and using the “U get“ app. This app concentrates exclusively on network analyses and is available at http://uget-app.com.
“U get“ checks and visualises the current mobile network performance and contributes the results to our crowdsourcing platform. Join the global community of users who understand their personal wireless performance, while contributing to the world’s most comprehensive picture of mobile customer experience.
Vodafone wins for the fourth time, Orange takes the second rank from Movistar and manages to improve clearly over last year‘s results. And once again, Yoigo shows considerable progress compared to previous years.
For the fourth time in a row, Vodafone is the clear winner of the P3 connect Mobile Benchmark in Spain. This may not come as a surprise, but one should bear in mind that it takes a lot of effort to secure the top position.
While Yoigo and Orange also show considerable improvements in their scores, Movistar remained essentially at the same level. Telefónica’s mobile network scores quite well in the data discipline, but it falls behind its competitor Orange in the voice tests. All in all, this is still a good result for Spain‘s largest operator, even though Movistar cedes the second rank to Orange.
Orange also improves over 2017‘s results and takes the second rank from Movistar. In the voice discipline, Orange delivers short call setup times and good speech quality thanks to its introduction of VoLTE. Currently, Orange is the only Spanish operator which offers this modern technology to its customers.
Yoigo improves by more than 100 points
Yoigo makes the biggest jump ahead. Even when taking into account the changes to the maximum available points entailed by our new crowd score, the smallest contender improved its score by more than 100 points. This is a remarkable accomplishment and good news for Yoigo‘s customers. On top of that, our new crowdsourced operational excellence score also delivers pleasing results, confirming a high level of stability and availability of all Spanish networks in the observation period.
For the fourth time in a row, Vodafone is the clear winner of the P3 connect Mobile Benchmark Spain thanks to a distinct lead in both the voice and data categories. Although currently the third largest Spanish operator, Vodafone proves to be capable of delivering high performance and quality to its customers.
A “very good“ Orange has not only taken the second rank from its rival Movistar but also managed to increase its customer base, currently being the second largest mobile operator in Spain. This success is based on considerable improvement efforts – which is proven by our benchmark, not least its crowd component.
With strong data results and a still good voice score, the largest Spanish operator ranks third. Compared to its 2017 scores, the Telefónica brand has slightly improved in the voice discipline, and more or less kept the same performance in the data category. All in all, this operator achieves the grade “good“.
Although Spain‘s smallest operator ranks fourth, the comparison to last year‘s result reveals a distinct improvement in the data and especially in the voice category. The other competitors may still be stronger, but this year, Yoigo‘s clear improvement efforts are rewarded with a well-deserved overall grade “good“.