Python Web Scraping Example: Getting KFC Restaurant Locations

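I have been learning web scraping with Python for a while, so I wrote a short script to check how much I have actually picked up. The goal is to fetch the locations of all KFC restaurants in a given city. Without further ado, here is the code: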

import requests  # import the library
if __name__ == "__main__":
    # specify the URL
    url = "http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword"

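The first step is to import the requests library. The next step in the scraping workflow is to find the URL to request. This one comes from the official KFC website: open the site, then press F12 to open the browser's developer tools.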

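After clicking through the panels in order, you will find that no data is listed yet. Now type any city into the site's search box and see what shows up.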

Scroll to the bottom.

After you click the search button on the site, a new entry appears. Click it to see most of the information the scraper needs.

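  • Request URL: the link that follows is the URL the request is sent to.
  • Request Method: POST tells us this is a POST request, not a GET request.
  • Content-Type: text/plain; charset=utf-8 means the response is declared as plain text, so the code below reads it with response.text rather than response.json().
  • User-Agent: this is the value we will use for User-Agent spoofing; it is used later on.
  • Form Data:
    • cname: # empty here, so it can be ignored
    • pid: # also empty, so it can be ignored
    • keyword: 深圳 # I searched for 深圳 (Shenzhen), so it shows up here; this is the search keyword
    • pageIndex: 1 # the page number
    • pageSize: 10 # the number of addresses shown per page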
With the analysis done, we can continue writing the code:

import requests  # import the library
if __name__ == "__main__":
    # specify the URL
    url = "http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword"
    # let the user choose what to search for
    kw = input("Enter the city to search for: ")
    # build the form parameters
    data = {
        'cname': '',
        'pid': '',
        'keyword': kw,       # search keyword
        'pageIndex': '1',    # page number
        'pageSize': '100'    # number of addresses per page
    }

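I do not want to be limited to one fixed area, so the keyword parameter is taken from user input; the other parameters can stay as they are. To get all the addresses for the area in a single request, set 'pageSize' to a larger value.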
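
An alternative to raising pageSize is to step through pageIndex one page at a time. The sketch below is only an illustration under assumptions the article does not confirm: it assumes that the last page comes back shorter than pageSize (or empty), and it takes a hypothetical fetch_page callable that accepts the form dictionary and returns the list of records on that page. The names build_payload, fetch_all, and max_pages are likewise made up.

```python
def build_payload(keyword, page_index, page_size=10):
    """Build the form fields for one page of results."""
    return {
        "cname": "",
        "pid": "",
        "keyword": keyword,
        "pageIndex": str(page_index),
        "pageSize": str(page_size),
    }


def fetch_all(keyword, fetch_page, page_size=10, max_pages=100):
    """Collect records page by page until a short or empty page appears.

    fetch_page is a hypothetical callable: payload dict -> list of records.
    max_pages is a safety cap so the loop always terminates.
    """
    records = []
    for page_index in range(1, max_pages + 1):
        page = fetch_page(build_payload(keyword, page_index, page_size))
        records.extend(page)
        if len(page) < page_size:  # assumed signal that nothing is left
            break
    return records
```

In a real run, fetch_page would wrap the POST request and turn the response into a list. Keeping it as a parameter means the paging logic can be tried out with a fake function before touching the network.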

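Next comes User-Agent spoofing: put the User-Agent value copied from the browser into a headers dictionary. Then send the request. Note that the page used a POST request, so we also need to use post to send ours: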

    response = requests.post(url=url,data=data,headers=headers)

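Sending the request returns the response data, which we then save to a file. That completes the script. Finally, run it to check that it raises no errors and that the data is what you wanted. The complete code is below for reference: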

import requests  # import the library

if __name__ == "__main__":
    # specify the URL
    url = "http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword"
    # let the user choose what to search for
    kw = input("Enter the city to search for: ")
    # build the form parameters
    data = {
        'cname': '',
        'pid': '',
        'keyword': kw,       # search keyword
        'pageIndex': '1',    # page number
        'pageSize': '100'    # number of addresses per page
    }
    # User-Agent spoofing
    headers = {'User-Agent': 'paste your own User-Agent string here'}
    # send the request
    response = requests.post(url=url, data=data, headers=headers)
    # get the response body as text
    list_data = response.text

    # save the data to a file
    filename = kw + 'kfc.html'
    with open(filename, 'w', encoding='utf-8') as fp:
        fp.write(list_data)

    print(filename, "saved!")
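
The script saves the response text exactly as received. If that text happens to be JSON, which the article does not confirm, pretty-printing it first makes the saved file easier to read. The following is a minimal sketch under that assumption; tidy_response_text is a made-up helper name, and it returns the text unchanged when it is not valid JSON.

```python
import json


def tidy_response_text(text):
    """Pretty-print text if it parses as JSON; otherwise return it as is.

    tidy_response_text is a hypothetical helper, not part of the original script.
    """
    try:
        parsed = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return text
    # ensure_ascii=False keeps Chinese characters readable in the output
    return json.dumps(parsed, ensure_ascii=False, indent=2)
```

It could be applied to list_data just before the write, for example fp.write(tidy_response_text(list_data)).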