Google offers the PageSpeed Insights API to help SEO pros and developers by mixing real-world data with simulation data, providing load performance timing data related to web pages.
The difference between Google PageSpeed Insights (PSI) and Lighthouse is that PSI involves both real-world and lab data, while Lighthouse performs a page loading simulation by modifying the connection and user-agent of the device.
Another point of difference is that PSI doesn’t supply any information related to web accessibility, SEO, or progressive web apps (PWAs), while Lighthouse provides all of the above.
Thus, when we use the PageSpeed Insights API for a bulk URL loading performance test, we won’t have any data for accessibility.
However, PSI provides more information related to page speed performance, such as “DOM Size,” “Deepest DOM Child Element,” “Total Task Count,” and “DOM Content Loaded” timing.
One more advantage of the PageSpeed Insights API is that it gives the “observed metrics” and “actual metrics” different names.
In this guide, you’ll learn:
- How to create a production-level Python script.
- How to use APIs with Python.
- How to construct data frames from API responses.
- How to analyze the API responses.
- How to parse URLs and process URL requests’ responses.
- How to store the API responses with proper structure.
An example output of the PageSpeed Insights API call with Python is below.
Screenshot from author, June 2022
Libraries For Using PageSpeed Insights API With Python
The necessary libraries to use the PSI API with Python are below.
- Advertools retrieves the testing URLs from the sitemap of a website.
- Pandas constructs the data frame and flattens the JSON output of the API.
- Requests makes a request to the specific API endpoint.
- JSON parses the API response into a Python dictionary so that specific values can be looked up.
- Datetime adds the current date to the output file’s name.
- Urllib parses the test subject website URL.
How To Use PSI API With Python?
To use the PSI API with Python, follow the steps below.
- Get a PageSpeed Insights API key.
- Import the necessary libraries.
- Parse the URL for the test subject website.
- Take the date of the moment for the file name.
- Take URLs into a list from a sitemap.
- Choose the metrics that you want from PSI API.
- Create a for loop for taking the API response for all URLs.
- Construct the data frame with chosen PSI API metrics.
- Output the results in the form of XLSX.
1. Get PageSpeed Insights API Key
Use the PageSpeed Insights API Documentation to get the API key.
Click the “Get a Key” button below.
Image from developers.google.com, June 2022
Select a project that you have created in Google Developer Console.
Image from developers.google.com, June 2022
Enable the PageSpeed Insights API on that specific project.
Image from developers.google.com, June 2022
You will need to use the specific API key in your API requests.
2. Import The Necessary Libraries
Use the lines below to import the fundamental libraries.
import advertools as adv
import pandas as pd
import requests
import json
from datetime import datetime
from urllib.parse import urlparse
3. Parse The URL For The Test Subject Website
To parse the URL of the subject website, use the code structure below.
domain = urlparse(sitemap_url)
domain = domain.netloc.split(".")[1]
The “domain” variable is the parsed version of the sitemap URL.
The “netloc” represents the specific URL’s domain section. When we split it with “.”, index 1 takes the “middle section,” which represents the domain name.
Here, “0” is for “www,” “1” is for the “domain name,” and “2” is for the “domain extension,” if we split it with “.”
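The index-based split above assumes the host starts with “www.” For a sitemap on a bare host such as example.com, index 1 would return the extension instead of the name. A minimal sketch of a more tolerant version follows. The helper name site_name and the example URLs are hypothetical, and the sketch makes no attempt to handle other subdomains.

```python
from urllib.parse import urlparse


def site_name(sitemap_url: str) -> str:
    """Return the first host label after an optional leading "www."."""
    host = urlparse(sitemap_url).netloc
    if host.startswith("www."):
        host = host[len("www."):]
    return host.split(".")[0]


# Hypothetical sitemap URLs, with and without the "www." prefix.
print(site_name("https://www.example.com/sitemap.xml"))
print(site_name("https://example.com/sitemap.xml"))
```

Both calls return the same name, so the output file is labeled consistently whichever form the sitemap URL takes.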
4. Take The Date Of Moment For File Name
To take the date of the specific function call moment, use the “datetime.now” method.
Datetime.now provides the specific time of the specific moment. Use “strftime” with the “%Y”, “%m”, and “%d” values. “%Y” is for the year. “%m” and “%d” are numeric values for the specific month and day.
date = datetime.now().strftime("%Y_%m_%d")
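As a small illustration of what the format string produces, the sketch below formats a fixed date and places it in a file name. The fixed date and the file-name pattern are assumptions made for the example. The script itself uses datetime.now().

```python
from datetime import datetime

# A fixed date is used here only to make the result predictable;
# the script itself calls datetime.now().
moment = datetime(2022, 6, 15)
date = moment.strftime("%Y_%m_%d")

# Hypothetical file-name pattern combining a site name and the date.
output_name = f"example_psi_{date}.xlsx"
print(output_name)  # example_psi_2022_06_15.xlsx
```

Because the year comes first and the month and day are zero-padded, files named this way sort chronologically.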
5. Take URLs Into A List From A Sitemap
To take the URLs from a sitemap file into a list, use the code block below.
sitemap = adv.sitemap_to_df(sitemap_url)
sitemap_urls = sitemap["loc"].to_list()
If you read the Python Sitemap Health Audit, you can learn more about sitemaps.
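A sitemap can list the same page more than once, for example with and without a trailing slash. One way to handle this before any API call is to normalize and deduplicate the list first. This is a minimal sketch, not part of the original script. The helper name and sample URLs are hypothetical, and it assumes that both forms of a URL serve the same page.

```python
def unique_with_trailing_slash(urls):
    """Add a trailing slash where missing and drop repeats, keeping order."""
    seen = set()
    unique = []
    for u in urls:
        normalized = u if u.endswith("/") else u + "/"
        if normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique


# Hypothetical sitemap entries containing slash and no-slash duplicates.
sample_urls = [
    "https://www.example.com/blog",
    "https://www.example.com/blog/",
    "https://www.example.com/about/",
    "https://www.example.com/blog",
]
print(unique_with_trailing_slash(sample_urls))
```

URLs that end in a file name or carry a query string would need different handling, since appending a slash changes what they point to.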
6. Choose The Metrics That You Want From PSI API
To choose the PSI API response JSON properties, you should see the JSON file itself.
It is highly related to the reading, parsing, and flattening of JSON objects.
It is even related to Semantic SEO, thanks to the concept of the “directed graph” and “JSON-LD” structured data.
In this article, we won’t focus on examining the specific PSI API response’s JSON hierarchies.
You can see the metrics that I have chosen to gather from the PSI API. The set is richer than the basic default output of the PSI API, which only gives the Core Web Vitals metrics, or Speed Index, Interaction to Next Paint, Time to First Byte, and First Contentful Paint.
Of course, it also gives “suggestions” by saying “Avoid Chaining Critical Requests,” but there is no need to put a sentence into a data frame.
In the future, these suggestions, or even every individual chain event with its KB and MS values, can be taken into a single column with the name “psi_suggestions.”
For a start, you can check the metrics that I have chosen; many of them will probably be new to you.
The first section of the PSI API metrics is below.
fid = []
lcp = []
cls_ = []
url = []
fcp = []
performance_score = []
total_tasks = []
total_tasks_time = []
long_tasks = []
dom_size = []
maximum_dom_depth = []
maximum_child_element = []
observed_fcp = []
observed_fid = []
observed_lcp = []
observed_cls = []
observed_fp = []
observed_fmp = []
observed_dom_content_loaded = []
observed_speed_index = []
observed_total_blocking_time = []
observed_first_visual_change = []
observed_last_visual_change = []
observed_tti = []
observed_max_potential_fid = []
This section includes all the observed and simulated fundamental page speed metrics, along with some non-fundamental ones, like “DOM Content Loaded” or “First Meaningful Paint.”
The second section of the PSI metrics focuses on possible byte and time savings from the unused code amount.
render_blocking_resources_ms_save = []
unused_javascript_ms_save = []
unused_javascript_byte_save = []
unused_css_rules_ms_save = []
unused_css_rules_bytes_save = []
A third section of the PSI metrics focuses on server response time and on the benefits of using responsive images, or the harm of not using them.
possible_server_response_time_saving = []
possible_responsive_image_ms_save = []
Note: The overall performance score comes from “performance_score.”
7. Create A For Loop For Taking The API Response For All URLs
The for loop takes all of the URLs from the sitemap file and uses the PSI API for each of them, one by one. The for loop for PSI API automation has several sections.
The first section of the PSI API for loop begins with duplicate URL prevention.
In sitemaps, you can see a URL that appears multiple times. This section prevents it.
for i in sitemap_urls[:9]:  # the slice limits the run to the first nine URLs
    # Prevent duplicate trailing-slash ("/") URL requests from overriding the information.
    if i.endswith("/"):
        r = requests.get(f"https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={i}&strategy=mobile&locale=en&key={api_key}")
    else:
        r = requests.get(f"https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={i}/&strategy=mobile&locale=en&key={api_key}")
Remember to check the “api_key” at the end of the endpoint for the PageSpeed Insights API.
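The f-string endpoints above place the page URL into the query string as-is. If a tested URL carries its own query string, its “&” characters would be read as separate API parameters. A minimal sketch of building the same endpoint with percent-encoding follows. The helper name psi_request_url, the sample page URL, and the placeholder key are assumptions for illustration.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def psi_request_url(page_url, api_key, strategy="mobile", locale="en"):
    """Build the runPagespeed request URL with every parameter percent-encoded."""
    query = urlencode(
        {"url": page_url, "strategy": strategy, "locale": locale, "key": api_key}
    )
    return f"{PSI_ENDPOINT}?{query}"


# Hypothetical page URL with its own query string, and a placeholder API key.
example = psi_request_url("https://www.example.com/?page=2&sort=asc", "YOUR_API_KEY")
print(example)
```

The returned string can be passed to requests.get in place of the f-string. Passing the same dictionary through the params argument of requests.get achieves the same encoding.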
Check the status code. In sitemaps, there may be URLs with a non-200 status code; these should be cleaned.
    if r.status_code == 200:
        #print(r.json())
        data_ = json.loads(r.text)
        url.append(i)
The next section reads the specific metrics from the “data_” dictionary and appends them to the lists that we created before.
        fcp.append(data_["loadingExperience"]["metrics"]["FIRST_CONTENTFUL_PAINT_MS"]["percentile"])
        fid.append(data_["loadingExperience"]["metrics"]["FIRST_INPUT_DELAY_MS"]["percentile"])
        lcp.append(data_["loadingExperience"]["metrics"]["LARGEST_CONTENTFUL_PAINT_MS"]["percentile"])
        cls_.append(data_["loadingExperience"]["metrics"]["CUMULATIVE_LAYOUT_SHIFT_SCORE"]["percentile"])
        performance_score.append(data_["lighthouseResult"]["categories"]["performance"]["score"] * 100)
The next section focuses on the “total task” count and DOM size.
        total_tasks.append(data_["lighthouseResult"]["audits"]["diagnostics"]["details"]["items"][0]["numTasks"])
        total_tasks_time.append(data_["lighthouseResult"]["audits"]["diagnostics"]["details"]["items"][0]["totalTaskTime"])
        long_tasks.append(data_["lighthouseResult"]["audits"]["diagnostics"]["details"]["items"][0]["numTasksOver50ms"])
        dom_size.append(data_["lighthouseResult"]["audits"]["dom-size"]["details"]["items"][0]["value"])
The next section takes the “DOM Depth” and “Deepest DOM Element.”
        maximum_dom_depth.append(data_["lighthouseResult"]["audits"]["dom-size"]["details"]["items"][1]["value"])
        maximum_child_element.append(data_["lighthouseResult"]["audits"]["dom-size"]["details"]["items"][2]["value"])
The next section takes the specific observed test results from our PageSpeed Insights API test.
        observed_dom_content_loaded.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedDomContentLoaded"])
        # Note: this reads the same "observedDomContentLoaded" key as the line above,
        # so the observed_fid column duplicates observed_dom_content_loaded.
        observed_fid.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedDomContentLoaded"])
        observed_lcp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["largestContentfulPaint"])
        observed_fcp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["firstContentfulPaint"])
        observed_cls.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["totalCumulativeLayoutShift"])
        observed_speed_index.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedSpeedIndex"])
        observed_total_blocking_time.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["totalBlockingTime"])
        observed_fp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedFirstPaint"])
        observed_fmp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["firstMeaningfulPaint"])
        observed_first_visual_change.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedFirstVisualChange"])
        observed_last_visual_change.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedLastVisualChange"])
        observed_tti.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["interactive"])
        observed_max_potential_fid.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["maxPotentialFID"])
The next section takes the unused code amount as wasted bytes and milliseconds, along with the render-blocking resources.
        render_blocking_resources_ms_save.append(data_["lighthouseResult"]["audits"]["render-blocking-resources"]["details"]["overallSavingsMs"])
        unused_javascript_ms_save.append(data_["lighthouseResult"]["audits"]["unused-javascript"]["details"]["overallSavingsMs"])
        unused_javascript_byte_save.append(data_["lighthouseResult"]["audits"]["unused-javascript"]["details"]["overallSavingsBytes"])
        unused_css_rules_ms_save.append(data_["lighthouseResult"]["audits"]["unused-css-rules"]["details"]["overallSavingsMs"])
        unused_css_rules_bytes_save.append(data_["lighthouseResult"]["audits"]["unused-css-rules"]["details"]["overallSavingsBytes"])
The next section provides the responsive image benefits and server response timing.
        possible_server_response_time_saving.append(data_["lighthouseResult"]["audits"]["server-response-time"]["details"]["overallSavingsMs"])
        possible_responsive_image_ms_save.append(data_["lighthouseResult"]["audits"]["uses-responsive-images"]["details"]["overallSavingsMs"])
The next section makes the function continue to work in case there is an error.
    else:
        continue
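The else branch above only skips non-200 responses. A 200 response that lacks one of the expected keys would still raise a KeyError partway through the appends, which also leaves the lists with different lengths. For instance, the field data under “loadingExperience” may be absent for a page without enough real-user data, though that is an assumption to verify against your own responses. A minimal sketch of a tolerant lookup follows. The helper name dig and the sample dictionary are hypothetical.

```python
def dig(data, *keys, default=None):
    """Follow a path of dict keys and list indexes; return default if a step is missing."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Hypothetical, heavily trimmed response with lab data but no field data.
sample = {
    "lighthouseResult": {
        "audits": {"dom-size": {"details": {"items": [{"value": 812}]}}}
    }
}

dom_size_value = dig(sample, "lighthouseResult", "audits", "dom-size", "details", "items", 0, "value")
fid_value = dig(sample, "loadingExperience", "metrics", "FIRST_INPUT_DELAY_MS", "percentile")
print(dom_size_value, fid_value)
```

Appending the returned value, which is None when a metric is missing, keeps every list the same length, so each row in the final table still lines up with its URL.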
Example Usage Of PageSpeed Insights API With Python For Bulk Testing
To use the specific code blocks, put them into a Python function.
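The last two steps listed earlier, constructing the data frame and writing the XLSX file, can be sketched as follows. This is an assumption about how the lists might be combined, not the author’s exact code. The helper name, the sample values, and the three chosen columns are hypothetical, and writing XLSX with pandas needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def build_psi_frame(columns: dict) -> pd.DataFrame:
    """Build one row per tested URL from equal-length metric lists."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("metric lists have different lengths")
    return pd.DataFrame(columns)


# Hypothetical sample values standing in for the lists filled by the loop.
columns = {
    "url": ["https://www.example.com/", "https://www.example.com/about/"],
    "performance_score": [92.0, 71.0],
    "lcp": [1850, 3200],
}
df = build_psi_frame(columns)
print(df)

# Writing the file is left commented out so the sketch has no side effects;
# "domain" and "date" are the variables from the earlier steps.
# df.to_excel(f"{domain}_psi_{date}.xlsx", index=False)
```

In the full script, the dictionary would hold every metric list from step 6, with the column names taken from the list names.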
Run the script, and you will get 29 page speed-related metrics in the columns below.
Screenshot from author, June 2022
Conclusion
The PageSpeed Insights API provides different types of page loading performance metrics.
It demonstrates how Google engineers perceive the concept of page loading performance, and possibly how they use these metrics from a ranking, UX, and quality-understanding point of view.
Using Python for bulk page speed tests gives you a snapshot of the entire website to help analyze possible user experience, crawl efficiency, conversion rate, and ranking improvements.
Featured Image: Dundanim/Shutterstock
