
Download "Обучение парсингу на Python #7 | Парсинг сайтов на фрилансе | Requests, Beautifulsoup"

Video tags

python
python programming
python for beginners
python training
python 3
python tutorial
Scraping
web-scraping
web scraping
parsing
what is parsing
website parsing
parsing explained
learning to parse
how to parse correctly
parsing data from a website
what parsing is
parser
python lessons
Beautifulsoup python
lxml python
how to parse
python website parsing
beautifulsoup methods
Web-Scraping
python programs
freelance
freelance income
Subtitles

00:00:07
Friends, hello everyone, you are on the PythonToday
00:00:10
channel and today we will practice
00:00:12
scraping
00:00:13
and together complete an order. The client needed us to
00:00:15
write a script to collect data from a
00:00:18
watch website. We needed to collect
00:00:20
the model names, the links and of course
00:00:23
the prices. As output the script should generate
00:00:26
JSON and CSV files with the recorded
00:00:28
data and save them under the current date.
00:00:31
We did the work together with one of
00:00:33
my padawans, Artem. Hello! Without you
00:00:36
this video wouldn't exist.
00:00:38
We haven't become millionaires yet, but we earned our 20
00:00:41
bucks an hour. Before
00:00:43
we start I want to say a special thank you to the
00:00:45
following subscribers. Friends, thank
00:00:48
you for your contribution to the development of the channel, thank
00:00:50
you for appreciating my work; the
00:00:52
videos come out largely thanks to
00:00:54
your support. We will need the
00:00:56
requests,
00:00:57
beautifulsoup4 and lxml libraries. If you
00:01:00
do not have them installed yet, install them
00:01:03
in your virtual environment with the following command,
00:01:08
then import them. We
00:01:14
create a function: we will try to send a
00:01:17
request to the site, save the response and
00:01:19
see what we can take away.
00:01:21
We create a dictionary for the request headers and
00:01:24
put the browser's user-agent in it,
00:01:56
then send a request to the page:
00:01:59
we call the get method of the requests library, to the
00:02:02
parameters of which we pass the URL.
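A minimal sketch of this step, assuming a placeholder catalog URL and a generic desktop user-agent string (the real values from the video are not shown here):

```python
# pip install requests beautifulsoup4 lxml
import requests

# placeholder URL; the actual watch-site address is an assumption
url = "https://example.com/watches/"

# request headers with a browser user-agent so the site sees a normal client
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
}

response = requests.get(url, headers=headers)
print(response.status_code)
```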
00:02:11
We write the condition for creating a directory;
00:02:14
we will save the html files into it so as
00:02:17
not to clutter the project tree. We import the
00:02:21
os module and call the exists method, to
00:02:30
the parameters of which we pass the name of
00:02:32
the directory we want to create. Everything
00:02:35
reads easily: if the specified
00:02:38
path does not exist, then we call the
00:02:40
mkdir method and create the directory. We save the
00:02:45
result of the request to a file: we use the
00:02:47
open context manager, set the
00:02:50
file name
00:02:57
and write to it the result obtained
00:02:59
from the text attribute of the response object.
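A condensed sketch of the directory check and page save, assuming a hypothetical "data" directory and file name:

```python
import os

# create a directory for the saved pages so the project tree stays clean
if not os.path.exists("data"):
    os.mkdir("data")

# save the page source; encoding="utf-8" helps avoid Cyrillic issues on Windows
with open("data/page_1.html", "w", encoding="utf-8") as file:
    file.write(response.text)
```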
00:03:02
Friends, if you like to work on Windows and
00:03:05
during the video you run into
00:03:07
problems with encoding when writing to and
00:03:09
reading from a file, especially with
00:03:11
recognizing Cyrillic, then watch
00:03:13
the video about writing data to a CSV file, where we
00:03:16
analyzed this. We
00:03:18
run the script and see what we
00:03:20
got: open the saved page in the
00:03:26
browser
00:03:27
[music]
00:03:32
naturally we got bare HTML without
00:03:34
styles,
00:03:35
and here we compare our watches
00:03:42
with the original: everything is ok, but
00:03:48
not all the watches are on the page we received;
00:03:52
by clicking on the "show more" button
00:03:54
we get the next portion. Yes, we could
00:03:58
click that button with selenium
00:03:59
or look in the network tab for the
00:04:01
sent request and see what
00:04:03
comes in response, but if we
00:04:05
look carefully at the page that we
00:04:07
managed to save, then below we will see a
00:04:10
pagination block; please note that on the
00:04:16
original page this class
00:04:18
does not appear at all.
00:04:20
From the number of pages we need to
00:04:24
take the number 5 and then write a loop in
00:04:26
which we will move to each
00:04:28
page, save the source code and then
00:04:31
parse it. We comment out the request code,
00:04:34
we don't need it yet since we have the source code
00:04:37
saved; we read the resulting page into the
00:04:39
src variable and proceed to
00:04:41
parsing.
00:04:47
We create a BeautifulSoup object, to
00:04:50
the parameters of which we pass the
00:04:52
src variable and the lxml parser, then we declare
00:04:57
the pages_count variable and take
00:04:59
the number of pages. The
00:05:00
links have no classes,
00:05:04
but for us this is not a problem: we find
00:05:06
the parent div with the pagination
00:05:08
container class and then collect
00:05:10
all the links from it; the number we need is in the
00:05:14
penultimate link.
00:05:27
Since we now have a list,
00:05:29
using index -2 and the
00:05:32
text method we get the number 5.
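A sketch of the pagination parsing, assuming hypothetical file and class names (the real pagination class is garbled in the subtitles):

```python
from bs4 import BeautifulSoup

# read the previously saved page
with open("data/page_1.html", encoding="utf-8") as file:
    src = file.read()

soup = BeautifulSoup(src, "lxml")

# "pagination-container" is a placeholder for the real parent div class
pagination = soup.find("div", class_="pagination-container").find_all("a")

# the page count sits in the penultimate link: [1, 2, 3, 4, 5, next]
pages_count = int(pagination[-2].text)
print(pages_count)  # 5
```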
00:05:35
If something is unclear to you right now, watch the
00:05:38
first video with a detailed analysis of the
00:05:40
main methods of the beautifulsoup
00:05:42
library; I think no questions will remain. A
00:05:45
link will be in the description.
00:05:46
We convert the resulting string to a number and
00:05:50
write a for loop in which we need to
00:05:52
go through the 5 pages: we use the
00:05:55
range function and add one to our number,
00:05:58
since the range function does not include
00:06:00
the last number; that is, if
00:06:02
we specify from one to five, we get as the
00:06:07
result the numbers from one to four.
00:06:14
We form a URL for the requests.
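A small sketch of the loop and URL formation; the query parameter name is an assumption, since sites paginate differently:

```python
# range() excludes the stop value, so we add 1 to reach page 5
for page in range(1, pages_count + 1):
    url = f"https://example.com/watches/?page={page}"  # hypothetical pattern
    print(url)
```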
00:06:39
We print the result and receive 5
00:06:45
links; we go to the last one and check
00:06:48
whether we managed to collect all the watches: the
00:06:57
latest model with a price of almost 106 thousand.
00:07:00
We load all the watches from the first page and
00:07:06
everything is correct. Then we send a request in a
00:07:14
loop to each of the 5 pages
00:07:22
and save them under different names:
00:07:27
the names will differ by a number
00:07:29
corresponding to the iteration. We put a
00:07:41
short pause between iterations
00:07:44
so that each request has time to load the data,
00:07:48
and we let our function return the
00:07:50
number of pages;
00:07:51
we will need this value in the next function.
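Putting the first function together, a condensed sketch under the same placeholder names as above:

```python
import os
import time

import requests
from bs4 import BeautifulSoup


def get_pages(url, headers):
    """Save every catalog page to disk and return the page count (sketch)."""
    if not os.path.exists("data"):
        os.mkdir("data")

    # first request: find out how many pages there are
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, "lxml")
    # "pagination-container" is a placeholder class, as above
    pagination = soup.find("div", class_="pagination-container").find_all("a")
    pages_count = int(pagination[-2].text)

    for page in range(1, pages_count + 1):
        response = requests.get(f"{url}?page={page}", headers=headers)

        # the file names differ by the iteration number
        with open(f"data/page_{page}.html", "w", encoding="utf-8") as file:
            file.write(response.text)

        time.sleep(1)  # short pause between iterations

    # the next function will need this value
    return pages_count
```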
00:07:53
We run the code, and in the directory
00:08:11
all 5 pages
00:08:13
appear; we open the last one and check the watches. Everything is
00:08:23
fine, we managed to collect all the necessary
00:08:26
pages; all that remains is to parse them, collect
00:08:29
and save the data we need.
00:08:31
We create a new function, collect_data;
00:08:35
it will accept the pages_count
00:08:37
obtained from the first function. First
00:08:40
of all we write a loop in which we will read
00:08:42
each page, with range going
00:08:45
from 1 to the pages_count received earlier;
00:08:50
we open the file and save the contents into a
00:08:53
variable.
00:09:06
We create a BeautifulSoup object, go to the
00:09:13
site and look at what we can
00:09:15
grab onto: the data we need lies in
00:09:25
div blocks with strange IDs.
00:09:27
We go deeper and see an "a" tag, the href
00:09:30
attribute of which contains a link to the
00:09:32
detailed description of the watch, and inside there are
00:09:35
several "p" tags that interest us:
00:09:37
in one of them there is the watch model and
00:09:40
in the other the price.
00:09:42
The card has a class, so we copy it and
00:09:45
check whether there are any extra tags with this
00:09:47
class. Great: this class matches
00:09:54
exactly the watch cards we need.
00:10:01
We create a variable and call the find_all method:
00:10:06
we pass the tag as the first argument and the
00:10:09
class by which we select as the second.
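A sketch of collecting the cards; the tag and class are placeholders for the real markup:

```python
# assuming "soup" is the BeautifulSoup object for a saved page, as above;
# every watch card is an <a> tag whose href leads to the detail page
items = soup.find_all("a", class_="product-card")  # placeholder class
print(len(items))
```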
00:10:12
Now we have a list of the necessary
00:10:15
cards; we write a for loop and go through each one.
00:10:21
First we select the article:
00:10:27
it is located in a "p" tag with the product
00:10:30
item article class; we call the find method,
00:10:37
specify the "p" tag
00:10:40
and then the class, and we get the contents
00:10:44
using the text method. Similarly we find
00:10:47
and take the price,
00:10:48
and then the URL,
00:11:15
which is located in the href attribute:
00:11:19
we use the get method, to the parameters
00:11:22
of which we pass the desired attribute name.
00:11:24
We print the result; we first work
00:11:39
with one page.
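A sketch of the extraction loop, again with placeholder class names:

```python
# assuming "items" holds the card tags collected above
for item in items:
    # the article and price sit in <p> tags inside the card
    article = item.find("p", class_="product-item-article").text
    price = item.find("p", class_="product-item-price").text
    url = item.get("href")  # the card link lives in the href attribute
    print(article, price, url)
```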
00:11:43
We call the function and run the script: the code
00:11:54
works but needs to be corrected a little.
00:11:56
First, we strip the extra
00:11:58
spaces in the article; in the price inside the
00:12:05
"p" tag there is also the string "руб" that we don't need; we remove it. We also
00:12:45
substitute the pages_count value into the loop, and
00:12:50
we create a list for our data: at
00:12:56
each iteration of the loop we will fill
00:12:59
it with dictionaries with the new values.
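A sketch of the cleanup and the list of dictionaries, with the same placeholder names as above:

```python
data = []

for item in items:
    # strip stray spaces and drop the "руб" suffix from the price
    article = item.find("p", class_="product-item-article").text.strip()
    price = (
        item.find("p", class_="product-item-price")
        .text.strip()
        .replace("руб", "")
        .strip()
    )
    url = item.get("href")

    data.append({"article": article, "price": price, "url": url})
```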
00:13:21
After all the pages have been processed and the
00:13:23
cards have been collected, we start writing,
00:13:26
first to the JSON file: we open the file for
00:13:29
writing with the "w" flag, import the
00:13:38
json module and call the dump method.
00:13:48
We pass our list as the first parameter,
00:13:50
then the file, the indent indentation and the
00:13:56
ensure_ascii parameter with the False flag. I have
00:14:00
already explained the meaning of these parameters more than once in
00:14:02
previous videos on parsing, see the
00:14:05
playlist. I almost forgot about the requirement to
00:14:08
save the files under the current date:
00:14:10
we import the datetime module and get the
00:14:13
current date in
00:14:25
the day.month.year format, then substitute
00:14:31
the value of the variable into the file name.
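A sketch of the JSON dump with a dated file name; the file name pattern is an assumption:

```python
import json
import datetime

# current date for the file names, e.g. "21.08.2021"
cur_date = datetime.datetime.now().strftime("%d.%m.%Y")

with open(f"data_{cur_date}.json", "w", encoding="utf-8") as file:
    # indent prettifies the output; ensure_ascii=False keeps Cyrillic readable
    json.dump(data, file, indent=4, ensure_ascii=False)
```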
00:14:38
We run
00:14:44
the code and see what we get: a JSON file
00:14:46
appears in the directory; we open it, and here is the collected data with
00:14:49
beautiful indentation. Everything is great; now
00:14:52
let's write the code to record everything to a CSV file.
00:14:54
First we write down the
00:14:57
column headers; of course, you don't have to do this and can
00:14:59
write the data at once; it all depends on the
00:15:01
customer's wishes. We open the file for
00:15:04
writing, create a writer, import the
00:15:16
csv module,
00:15:24
pass our file to the writer method,
00:15:27
call the writer's writerow method,
00:15:30
and in the tuple we list the desired
00:15:33
headers: article, link and price. In the
00:15:41
loop, at each iteration, we will add
00:15:44
rows to our file; everything is the same, only
00:15:49
the flag changes to append, and of course we change
00:15:52
the values of the columns to the data we collected.
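A sketch of the CSV writing: headers once with the "w" flag, then one row per card with the "a" (append) flag:

```python
import csv

# write the header row once
with open(f"data_{cur_date}.csv", "w", encoding="utf-8", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(("Article", "Link", "Price"))

# append a row for every collected card
for item in data:
    with open(f"data_{cur_date}.csv", "a", encoding="utf-8", newline="") as file:
        writer = csv.writer(file)
        writer.writerow((item["article"], item["url"], item["price"]))
```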
00:15:55
We delete the previous file and
00:16:06
run the script:
00:16:15
the JSON is ok;
00:16:20
we open the CSV and, super, all the data
00:16:27
is collected.
00:16:28
I hope the video was useful to you, so
00:16:31
don't forget to like it. All the code
00:16:33
can be downloaded on github
00:16:35
or in the telegram channel, where you will find a
00:16:37
lot more useful information, so
00:16:39
subscribe; links will be in the description.
00:16:41
Friends, thank you so much for watching.
00:16:44
If the video was useful and
00:16:46
interesting to you and you want to get more
00:16:47
practice with python and other languages,
00:16:50
be sure to like it and share
00:16:52
your opinions or ideas in the comments,
00:16:54
subscribe to the channel, stay healthy,
00:16:57
bye everyone!

Description:

Python web parsing (Web-Scraping) tutorial. In this video we complete a freelance order for scraping a website using the requests and Beautifulsoup4 libraries. We will learn to make requests, save pages, parse the information we need from them, and then save the data to files in JSON and CSV format, i.e. into tables.

💰 Support the project: https://yoomoney.ru/to/410019570956160
🔥 Become a channel sponsor: https://www.youtube.com/channel/UCrWWcscvUWaqdQJLQQGO6BA/join

*****Links*****
Cheap/reliable server in Europe: https://zomro.com/?from=246874 promo_code: zomro_246874
Good proxy service: https://proxy6.net/
Cool freelance order | Recovering a forgotten password for an Excel file with Python: https://www.youtube.com/watch?v=DXVs0rJ6OPM
Writing a Telegram bot in Python + Deploying a Telegram bot to a server (hosting): https://www.youtube.com/watch?v=x-VB3b4pKcU
Playlist on face recognition with Python: https://www.youtube.com/playlist?list=PLqGS6O1-DZLpVl2ks4S_095efPUgunsJo
Playlist on website parsing with Python: https://www.youtube.com/playlist?list=PLqGS6O1-DZLprgEaEeKn9BWKZBvzVi_la
Playlist on the Instagram bot: https://www.youtube.com/playlist?list=PLqGS6O1-DZLqYx83MknKLaDxaIlES2nZr
Project code on github: https://github.com/pythontoday/scrap_tutorial
And in the telegram channel: https://t.me/python2day

*****Social networks*****
Telegram: https://t.me/python2day
