Puppeteer

Puppeteer Description

Puppeteer is a JavaScript library that provides a high-level API for controlling Chrome or Firefox via the DevTools protocol or WebDriver BiDi. Puppeteer runs headless by default, but can be configured to run in a headful browser.

Puppeteer Features

Most things you can do manually in the browser can be done with Puppeteer! Here are a few examples to get you started:

  • Automate form submissions, UI testing, keyboard input, etc.
  • Create an automated testing environment using the latest JavaScript and browser features.
  • Capture a timeline trace of your site to diagnose performance issues.
  • Test Chrome extensions.
  • Create screenshots and PDFs of web pages.
  • Scan a SPA (Single-Page Application) and generate pre-rendered content (i.e. “SSR”, Server-Side Rendering).

Homepage:

Author: Mathias Bynens, Google Inc.

License: Apache-2.0 license

Puppeteer Help

Puppeteer is controlled from JavaScript code. Scripts that launch the browser and perform actions are saved in files with the .js extension and run with node.

Puppeteer options are set in the same JavaScript code that launches Puppeteer. Actions are performed by calling the methods of the Puppeteer classes.

Documentation for Puppeteer options, classes, and methods:

Puppeteer Manual

Puppeteer does not have a man page.

Puppeteer Usage Example

1. Example of taking a screenshot of a website

Create a file screenshot-fullpage.js and copy the following code into it:

const puppeteer = require('puppeteer');

async function run() {
	const browser = await puppeteer.launch();
	const page = await browser.newPage();
  
	await page.goto('https://suip.biz/?act=client-tls-fingerprinting');
	await page.screenshot({ path: 'ja4-full.png', fullPage: true });
  
	await browser.close();
}

run();

Run the file like this:

node screenshot-fullpage

You will get a screenshot of the entire page.

2. Example of getting the DOM (HTML) of a web page

Create a file html-to-file.js with the following content:

const puppeteer = require('puppeteer');

async function run() {
	const browser = await puppeteer.launch();
	const page = await browser.newPage();
	const customUserAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36';
  
	await page.setViewport({width: 1440, height: 3440});
	await page.setUserAgent(customUserAgent);
	await page.goto('https://w-e-b.site/?act=client-tls-fingerprinting');
	
	const html = await page.content();
	const fs = require('fs');
	try {
		// Wait for the file to be written before closing the browser
		await fs.promises.writeFile('test.htm', html);
		console.log('The file was saved!');
	} catch (err) {
		console.log(err);
	}

	await browser.close();
}

run();

Run the file like this:

node html-to-file

As a result of running this program, the source code of the page obtained with Puppeteer will be saved to the file test.htm.

3. Example of using Puppeteer to extract all links to images

Create a file extract-art-img.js with the following content:

const puppeteer = require('puppeteer');

async function run() {
	const browser = await puppeteer.launch();
	const page = await browser.newPage();
	const customUserAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36';
  
	await page.setViewport({width: 1440, height: 3440});
	await page.setUserAgent(customUserAgent);
	await page.goto('https://hackware.ru/?p=19139');

	const tags = await page.evaluate(() => {
		return Array.from(document.querySelectorAll('div > div.entrytext > p > a > img'))
			.map(img => img.src);
	});
  
	for (const tag of tags) {
		console.log(tag);
	}
  
	await browser.close();
}

run();

Run the file with node extract-art-img. The script prints the URL of every image in the article on the specified page.

How to install Puppeteer

See also: Where does npm install packages? “Error: Cannot find module” even after installing the module

Installation on Kali Linux

Start by installing the JavaScript package manager, npm:

sudo apt install npm

Then run the following command:

npm i puppeteer

This command will install Puppeteer locally. To install Puppeteer globally, run the following command:

sudo npm i -g puppeteer

For details on local and global installation, as well as how to import a module installed globally, see the following link: https://suay.site/?p=4872

Installation on BlackArch

Start by installing the JavaScript package manager, npm:

sudo pacman -S npm

Then run the following command:

npm i puppeteer

This command will install Puppeteer locally. To install Puppeteer globally, run the following command:

sudo npm i -g puppeteer

For details on local and global installation, as well as how to import a module installed globally, see the following link: https://suay.site/?p=4872

Installation on Debian, Linux Mint, Ubuntu

Start by installing the JavaScript package manager, npm:

sudo apt update
sudo apt install npm

Then run the following command:

npm i puppeteer

How to update Puppeteer

To update a local installation of Puppeteer on any distribution, run the following command in the directory where Puppeteer is installed:

npm update puppeteer
