Python Crawler: English Teacher Recruitment Data

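This Python project mainly crawls 988 job postings for foreign English teachers and 6,242 job postings for English teachers, and analyzes the hiring situation for foreign teachers.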

Description

import requests
import pandas as pd
from lxml import etree


class TeachInChina(object):
    """Crawl teaching job listings from jobleadchina.com and save them to a CSV file."""

    def __init__(self, max_page):
        # One listing URL per page, from page 1 to max_page inclusive.
        self.start_urls = ['http://www.jobleadchina.com/job?job_industry=Teaching' \
                           '&company_name=&page={}'.format(page) for page in range(1, max_page + 1)]

    def get_data(self):
        for url in self.start_urls:
            # A timeout keeps a stalled connection from hanging the crawl.
            res = requests.get(url, timeout=10)
            # The page number is the value of the last query parameter.
            page = url.split('=')[-1]
            # Report success only if the page was actually parsed and saved.
            if self.parse_data(res, page):
                print('Page {} crawled and saved.'.format(page))

    @staticmethod
    def parse_data(res, page):
        if res.status_code == 200:
            parsed = etree.HTML(res.text)
            # Each XPath query returns one list per field for the whole page.
            title = parsed.xpath('//*[@class="positionTitle"]/a/text()')
            link = parsed.xpath('//*[@class="positionTitle"]/a/@href')
            salary = [slr.strip() for slr in parsed.xpath('//*[@class="salaryRange"]/text()')]
            company = parsed.xpath('//*[@class="companyName"]/a/text()')
            area = parsed.xpath('//*[@class="jobThumbnailCompanyIndustry"]/span[3]/text()')
            update_time = parsed.xpath('//*[@class="post-time"]/text()')
            exp_title = parsed.xpath('//*[@class="jobThumbnailPositionRequire"]/span[3]/text()')
            education = parsed.xpath('//*[@class="jobThumbnailPositionRequire"]/span[1]/text()')
            com_type = parsed.xpath('//*[@class="jobThumbnailCompanyIndustry"]/span[1]/text()')
            # All lists must have the same length, otherwise pandas raises ValueError.
            data = pd.DataFrame({'title': title, 'link': link, 'salary': salary,
                                 'company': company, 'area': area, 'update_time': update_time,
                                 'exp_title': exp_title, 'education': education,
                                 'com_type': com_type})
            # Append every page to the same file; write the header row for page 1 only.
            data.to_csv('jobleadchina.csv', index=False, mode='a', header=(page == '1'))
            return True
        else:
            print('Request to {} failed.'.format(res.url))
            return False


if __name__ == '__main__':
    job = TeachInChina(96)
    job.get_data()
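
The script above reads the page number with url.split('=')[-1], which only works while page is the last query parameter, and it writes the CSV header only when the page number is 1, so a second run appends a second header row. The following is a minimal sketch of one possible alternative, not the project's own code. It uses only the standard library, and the helper names build_page_urls, page_from_url and append_rows are hypothetical.

```python
import csv
import os
from urllib.parse import parse_qs, urlencode, urlsplit

# Listing endpoint taken from the script above.
BASE_URL = 'http://www.jobleadchina.com/job'


def build_page_urls(max_page):
    """Return one listing URL per page, from 1 to max_page inclusive."""
    urls = []
    for page in range(1, max_page + 1):
        query = urlencode({'job_industry': 'Teaching', 'company_name': '', 'page': page})
        urls.append('{}?{}'.format(BASE_URL, query))
    return urls


def page_from_url(url):
    """Read the 'page' query parameter by name instead of by position."""
    return int(parse_qs(urlsplit(url).query)['page'][0])


def append_rows(path, fieldnames, rows):
    """Append dict rows to a CSV file, writing the header only if the file is new or empty."""
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, 'a', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if need_header:
            writer.writeheader()
        writer.writerows(rows)
```

Deciding on the header from the state of the file rather than from the page number means the output stays well-formed even if page 1 fails or the crawl is resumed later.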

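This project mainly crawls 822 job postings for foreign English teachers and 6,242 job postings for English teachers, and analyzes the hiring situation for foreign teachers.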
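
As an illustration of the kind of summary such an analysis might start from, here is a minimal sketch that counts postings per value of one column. It assumes the column names the crawler writes (title, area, education, and so on). The sample rows are made-up placeholders, not real scraped data, and the function name count_by_column is hypothetical. It uses the standard library rather than pandas so that it runs without extra dependencies.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the crawler's column layout; values are placeholders.
SAMPLE_CSV = """title,area,education
ESL Teacher,Beijing,Bachelor
Kindergarten Teacher,Shanghai,Bachelor
English Teacher,Beijing,Master
"""


def count_by_column(csv_file, column):
    """Count how many rows share each value of the given column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column].strip() for row in reader)


if __name__ == '__main__':
    by_area = count_by_column(io.StringIO(SAMPLE_CSV), 'area')
    # Print the areas from most to least frequent.
    for area, n in by_area.most_common():
        print(area, n)
```

To run the same count on real crawler output, pass an open file instead of the in-memory sample, for example open('jobleadchina.csv', newline='', encoding='utf-8').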

Suitable for learners of Python data analysis and Python web crawling, pandas users, and learners of data visualization.

Please credit the source when reposting: apollocode » Python Crawler: English Teacher Recruitment Data

File list (partial)

Name	Size	Modified
data_gm.csv	11.95 KB	2019-12-30
jobleadchina.csv	18.06 KB	2019-12-30
jobleadchina.py	0.83 KB	2019-12-30
local_english_teacher.py	0.81 KB	2019-12-30
wechat_group_member.py	0.34 KB	2019-12-30
数据分析.ipynb	58.49 KB	2019-12-30
外语培训.csv	8.97 KB	2019-12-30
幼儿园.csv	5.81 KB	2019-12-30
职业院校.csv	1.27 KB	2019-12-30
中小学.csv	91.32 KB	2019-12-30
