Extract Cookie from a site with Puppeteer
Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium. You can learn more about Puppeteer in its official documentation.

In this article, we will write a script that automatically logs you in through the Google OAuth form and scrapes the cookie data from the site.
Below are some of Puppeteer's use cases:
- Test automation in modern web applications: verifying that the features we are exposing our users/customers to are actually behaving as expected.
- Taking screenshots of web pages: useful for a variety of different uses going from simple archiving to automated comparison for e.g. visual testing.
- Scraping web sites for data: extracting data from websites for later retrieval or analysis.
- Automating interaction of web pages: speed up and scale any sort of sequence of actions we would like to perform on a website automatically.
Let's scrape a site, shall we?
Create a project folder and structure it as below:
```text
root
----- script.js
----- cookies.json
----- config.json
```
In the cookies.json file, add an empty object as below:
```json
{ }
```
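The script we write below checks this file to decide whether a saved session exists, so it can help to read the file defensively. The following is a minimal sketch of one way to do that. `loadCookies` is a hypothetical helper name (it is not part of Puppeteer), and the sketch assumes that saved cookies are stored as a JSON array, while the initial `{ }` placeholder means "no session yet".

```javascript
const fs = require("fs");

// Hypothetical helper (not part of Puppeteer): read saved cookies from disk.
// Returns an array of cookie objects, or [] when there is no usable session.
function loadCookies(filePath) {
  if (!fs.existsSync(filePath)) return [];
  try {
    const parsed = JSON.parse(fs.readFileSync(filePath, "utf8"));
    // The initial "{ }" placeholder (or any non-array value) means "no session".
    return Array.isArray(parsed) ? parsed : [];
  } catch (err) {
    // Empty or malformed JSON is treated the same as having no saved session.
    return [];
  }
}
```

With a helper like this, the script could call `loadCookies("./cookies.json")` and test `cookies.length` instead of requiring the JSON file directly.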
In the config.json file, add your mail and password as below:
```json
{
  "mail": "your email",
  "password": "your password"
}
```
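Since the login steps depend on both fields being present, it may be worth validating the config before the browser starts. The sketch below shows one possible check. `loadConfig` is a hypothetical helper name, and it assumes only the two fields shown above (`mail` and `password`).

```javascript
const fs = require("fs");

// Hypothetical helper: load the config file and fail early if a field is missing.
function loadConfig(filePath) {
  const config = JSON.parse(fs.readFileSync(filePath, "utf8"));
  for (const key of ["mail", "password"]) {
    if (typeof config[key] !== "string" || config[key].trim() === "") {
      throw new Error(`config is missing a non-empty "${key}" field`);
    }
  }
  return config;
}
```

Calling `loadConfig("./config.json")` at the top of the script would surface a missing field as a clear error instead of a failure halfway through the login form.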
Now let's install Puppeteer:
```shell
npm i puppeteer
```
Start writing the script in the script.js file. Copy the code below:
```javascript
const puppeteer = require("puppeteer");
const fs = require("fs");
const cookies = require("./cookies.json");
const config = require("./config.json");

(async () => {
  try {
    // Start Puppeteer with a visible browser window and create a new page
    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();

    // Check for a saved session (cookies.json holds an array once cookies are saved)
    if (Object.keys(cookies).length) {
      // Set the saved cookies on the page
      await page.setCookie(...cookies);
      // Go to Sentinel
      await page.goto("https://sentinel.zerodha.com/", {
        waitUntil: "networkidle2",
      });
    } else {
      // Open the login page
      await page.goto("https://sentinel.zerodha.com/login", {
        waitUntil: "networkidle0",
      });
      // Start the Google OAuth login
      await page.click('a[href="/api/accounts/google/login/"]', { delay: 30 });
      // Enter the email
      await page.waitForSelector('input[id="identifierId"]');
      await page.type('input[id="identifierId"]', config.mail, { delay: 30 });
      await page.click("#identifierNext");
      // Enter the password
      await page.waitForSelector('input[type="password"]', { visible: true });
      await page.type('input[type="password"]', config.password);
      await page.waitForSelector("#passwordNext", { visible: true });
      // Submit and wait for the redirect so the session cookies exist
      // (the wait condition may need adjusting for longer redirect chains)
      await Promise.all([
        page.waitForNavigation({ waitUntil: "networkidle0" }),
        page.click("#passwordNext"),
      ]);
      // Get all cookies of the current browser session over the DevTools Protocol
      const client = await page.target().createCDPSession();
      const { cookies: currentCookies } = await client.send("Network.getAllCookies");
      // Store the cookie array so it can be spread into page.setCookie() next time
      fs.writeFileSync("./cookies.json", JSON.stringify(currentCookies));
      await browser.close();
    }
  } catch (e) {
    console.log(e);
  }
})();
```
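Saved cookies eventually expire, so restoring them blindly can leave the script on a logged-out page. The sketch below shows one way to drop stale entries before restoring a session. `dropExpiredCookies` is a hypothetical helper name, and the sketch assumes each cookie's `expires` field is in seconds since the UNIX epoch, with session cookies marked by `session: true` or a non-positive `expires` value.

```javascript
// Hypothetical helper: drop cookies whose expiry time has already passed.
// Assumes `expires` is in seconds since the UNIX epoch, and that session
// cookies have `session: true` or a non-positive `expires` (commonly -1).
function dropExpiredCookies(cookies, nowMs = Date.now()) {
  const nowSeconds = nowMs / 1000;
  return cookies.filter((cookie) => {
    const isSessionCookie = cookie.session === true || !(cookie.expires > 0);
    return isSessionCookie || cookie.expires > nowSeconds;
  });
}
```

In the script, the result could be passed to `page.setCookie(...)`. If nothing is left after filtering, falling back to the login branch is probably the safer choice.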
To run the script:
node script.js
Check the cookies.json file. It now contains the cookies from the site. There is a lot more that you can do with Puppeteer. Hope this helps.
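As one example of what the saved cookies can be used for, the sketch below turns them into a `Cookie` request header for a plain HTTP client. `toCookieHeader` is a hypothetical helper name, and the domain matching is deliberately simplified: it assumes a cookie applies when the host equals the cookie's domain or is a subdomain of it, and it ignores the `path` and `secure` attributes.

```javascript
// Hypothetical helper: build a Cookie request header from saved cookies.
// A cookie is included when the host equals its domain or is a subdomain of it
// (a leading dot in the cookie domain is ignored). Path and secure flags are
// not checked in this simplified sketch.
function toCookieHeader(cookies, host) {
  return cookies
    .filter((cookie) => {
      const domain = String(cookie.domain || "").replace(/^\./, "");
      return domain !== "" && (host === domain || host.endsWith("." + domain));
    })
    .map((cookie) => `${cookie.name}=${cookie.value}`)
    .join("; ");
}
```

For instance, `toCookieHeader(require("./cookies.json"), "sentinel.zerodha.com")` would produce a string that can be sent as the `Cookie` header of a request to that host.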
Happy coding...