How to capture dynamic data in web pages

First, let me make clear what I mean by dynamic data.

Definition: here, dynamic data refers to page content that is generated by JavaScript. It is not in the page's source file; it is generated after the page has been loaded into the browser.

Let's get down to business.

Grabbing a static page is very simple. You can fetch the HTML source with Java and then extract the information you want by parsing that source. For example, to get the weather from China Weather Network, you only need to find the corresponding HTML page (/weather/1012101.shtml).
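A minimal sketch of the static case, assuming Java 11 or later and its built-in java.net.http client. The class name StaticPageFetcher and the choice of the <title> element as the "information you want" are illustrative assumptions, not something the original site dictates; a real program would extract the weather fields instead.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: downloads a page and pulls one value out of its source.
class StaticPageFetcher {
    // Matches the contents of the <title> element, across line breaks.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the raw HTML exactly as the server sends it (no JavaScript is run). */
    static String fetchHtml(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    /** Extracts the trimmed page title, or an empty Optional if there is none. */
    static Optional<String> extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java StaticPageFetcher <url>");
            return;
        }
        String html = fetchHtml(args[0]);
        System.out.println(extractTitle(html).orElse("(no title found)"));
    }
}
```

Because fetchHtml only returns what the server sends, anything that JavaScript adds later will be missing from the string it returns.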

Suppose I need to enter a city name and get the weather for that city, with China Weather Network still as the data source. The first step is to find the page that corresponds to the city. A quick look shows that each city's page URL is built from a numeric ID; Hangzhou, for example, corresponds to an ID beginning with 10121. So the key to the program is finding the mapping between cities and pages.
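Once the mapping is known, the lookup itself is small. The sketch below is only an illustration: the class CityPageResolver, the host http://weather.example, and the ID 000000001 are placeholders invented for the example, and the /weather/<id>.shtml path shape is taken from the page path mentioned earlier.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup table from city name to weather page URL.
class CityPageResolver {
    private final String baseUrl;
    private final Map<String, String> cityToId = new LinkedHashMap<>();

    CityPageResolver(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Records that the given city is served by the page with the given ID. */
    void register(String city, String id) {
        cityToId.put(normalize(city), id);
    }

    /** Builds the page URL for a city, or returns empty if the city is unknown. */
    Optional<String> pageUrl(String city) {
        return Optional.ofNullable(cityToId.get(normalize(city)))
                .map(id -> baseUrl + "/weather/" + id + ".shtml");
    }

    // Makes lookups tolerant of surrounding spaces and letter case.
    private static String normalize(String city) {
        return city.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        CityPageResolver resolver = new CityPageResolver("http://weather.example");
        resolver.register("Hangzhou", "000000001"); // placeholder ID, not a real one
        System.out.println(resolver.pageUrl("hangzhou").orElse("unknown city"));
    }
}
```

The hard part is not this lookup but filling the table, which is what the rest of the article is about.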

It turns out that the site's search box contains links to most cities in China, from which the mapping between cities and _id can be obtained. With that breakthrough found, it is time to act: go to the home page, view its source, and find the location of the search box.

It turns out that the data is added dynamically by JavaScript. It is not in the page source, but you can see it with Chrome's Inspect Element.

What we can do now is copy the rendered HTML out of Chrome into a file, and then parse that file to get the mapping between cities and URLs. The problem is that if the site changes this mapping, we can only react after the fact: the saved file goes stale and the program has to be changed.
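Parsing the saved file could look roughly like the following sketch. It assumes, without confirmation from the site, that each city appears as an anchor of the form <a href="/weather/<digits>.shtml">City</a>; the class name CityLinkParser and the sample IDs are made up for illustration. A regular expression is enough for markup this regular, though a real HTML parser would be more robust.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the HTML copied out of Chrome.
class CityLinkParser {
    // Group 1: the href, group 2: the numeric ID, group 3: the link text (city name).
    private static final Pattern CITY_LINK = Pattern.compile(
            "<a\\s[^>]*href=\"([^\"]*/weather/(\\d+)\\.shtml)\"[^>]*>\\s*([^<]+?)\\s*</a>",
            Pattern.CASE_INSENSITIVE);

    /** Returns city name -> page URL for every matching anchor, in document order. */
    static Map<String, String> parse(String html) {
        Map<String, String> cityToUrl = new LinkedHashMap<>();
        Matcher m = CITY_LINK.matcher(html);
        while (m.find()) {
            cityToUrl.put(m.group(3), m.group(1));
        }
        return cityToUrl;
    }

    /** Reads the saved file as UTF-8 (an assumption about its encoding) and parses it. */
    static Map<String, String> parseFile(Path file) throws IOException {
        return parse(Files.readString(file, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java CityLinkParser <saved-page.html>");
            return;
        }
        parseFile(Path.of(args[0]))
                .forEach((city, url) -> System.out.println(city + " -> " + url));
    }
}
```

This still depends on a manually saved snapshot, so it inherits the staleness problem described above.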

The remaining question is how to get the HTML content that JavaScript generates dynamically from within Java. I would like to hear what you think.