Python crawling: creating a CSV file

Jonah's Whale 2018. 11. 8. 19:38

Adapted from a post on 빵떡로그:

http://db-log.tistory.com/entry/32-%ED%81%AC%EB%A1%A4%EB%A7%81%ED%95%9C-%EB%8D%B0%EC%9D%B4%ED%84%B0%EB%A5%BC-csv-%ED%8C%8C%EC%9D%BC-%EB%A7%8C%EB%93%A4%EA%B8%B0?category=766620

Full code

import requests
from bs4 import BeautifulSoup
import csv

def mnet_Crawling(html):
    temp_list = []
    temp_dict = {}

    tr_list = html.select('div.MnetMusicList.MnetMusicListChart > div.MMLTable.jQMMLTable > table > tbody > tr')

    for tr in tr_list:
        rank = tr.find('td', {'class': 'MMLItemRank'}).find('span').text.strip('위')
        artist = tr.find('a', {'class': 'MMLIInfo_Artist'}).text
        title = tr.find('a', {'class':'MMLI_Song'}).text
        album = tr.find('a', {'class':'MMLIInfo_Album'}).text
        img = tr.find('div',{'class':'MMLITitle_Album'}).find('img').get('src')

        temp_list.append([rank, img, title, artist, album])
        temp_dict[rank] = {'img': img, 'title': title, 'artist': artist, 'album': album}
    return temp_list, temp_dict


#============================================================= End of mnet_Crawling() ===============================#
def toCSV(mnet_list):
    file = open('mnet_chart.csv', 'w', encoding='utf-8', newline='')
    csvfile = csv.writer(file)
    for row in mnet_list :
        csvfile.writerow(row)
    file.close()
#============================================================ End of toCSV() ========================================#

mnet_list = []
mnet_dict = {}

for page in [1, 2]:
    req = requests.get('http://www.mnet.com/chart/TOP100/?pNum={}'.format(page))
    html = BeautifulSoup(req.text, 'html.parser')

    # mnet_Crawling() returns a tuple (list, dict), so call it once and unpack both parts
    page_list, page_dict = mnet_Crawling(html)
    mnet_list += page_list
    mnet_dict.update(page_dict)

# Print the list
for item in mnet_list:
    print(item)

# Print the dictionary
for item in mnet_dict:
    print(item, mnet_dict[item]['img'], mnet_dict[item]['title'], mnet_dict[item]['artist'], mnet_dict[item]['album'])

# Create the CSV file
toCSV(mnet_list)
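
The crawling function returns two values at once, and Python packs them into a single tuple, which is why the result can be read with indexes 0 and 1. A minimal sketch of that idea follows; `crawl_page` is a hypothetical stand-in with made-up data, not the real crawler, and it needs no network access.

```python
def crawl_page(page):
    """Hypothetical stand-in for mnet_Crawling(): returns a list and a dict."""
    rows = [[str(page), 'title{}'.format(page)]]
    by_rank = {str(page): {'title': 'title{}'.format(page)}}
    return rows, by_rank  # packed into one tuple: (list, dict)


all_rows = []
all_by_rank = {}

for page in [1, 2]:
    result = crawl_page(page)
    # result[0] is the list and result[1] is the dict;
    # unpacking gives both names in a single call.
    rows, by_rank = result
    all_rows += rows
    all_by_rank.update(by_rank)  # same effect as dict(a, **b) when keys are strings

print(all_rows)
print(all_by_rank)
```

Unpacking also means each page is parsed only once instead of once per index.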

1. Import the csv module

import csv

2. Create the toCSV() function

# Create the CSV file

toCSV(mnet_list)

def toCSV(mnet_list):
    file = open('mnet_chart.csv', 'w', encoding='utf-8', newline='')
    csvfile = csv.writer(file)
    for row in mnet_list :
        csvfile.writerow(row)
    file.close()

file =	open	'mnet_chart.csv'	'w'	encoding='utf-8'	newline=''
variable	function that opens the file	name of the file to create	write mode	avoids encoding errors	prevents blank lines between rows

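About encoding='utf-8':

The code sometimes runs without it, but depending on your environment (for example, the editor you run it from), it can raise an error, most likely because the default encoding cannot represent Korean text.

Add it if you see an error like the following: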

UnicodeEncodeError: 'ascii' codec can't encode characters in position 64-67: ordinal not in range(128)
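
As a rough illustration of why the explicit encoding helps, the sketch below writes a row containing Korean text with encoding='utf-8', reads it back, and then deliberately forces ASCII to trigger the same kind of error. The file name, the temporary directory, and the sample row are assumptions made up for this example.

```python
import csv
import os
import tempfile

rows = [['1', '노래 제목', '가수']]  # sample row containing Korean text
path = os.path.join(tempfile.mkdtemp(), 'encoding_demo.csv')

# With an explicit encoding the result does not depend on the platform default.
with open(path, 'w', encoding='utf-8', newline='') as file:
    csv.writer(file).writerows(rows)

with open(path, 'r', encoding='utf-8', newline='') as file:
    read_back = list(csv.reader(file))

print(read_back == rows)

# Forcing ASCII reproduces the kind of error quoted above.
try:
    with open(path, 'w', encoding='ascii', newline='') as file:
        csv.writer(file).writerows(rows)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(ascii_failed)
```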

About newline='':

Without it, a blank line can appear after every row of the generated CSV file. This option prevents that.
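
To see what newline='' does, it can help to look at the raw bytes. In the sketch below (file name and sample rows are made up), csv.writer ends each row with '\r\n' and newline='' tells open() to leave that untouched. Without it, Windows would translate the '\n' into '\r\n' again, giving '\r\r\n', which many programs display as an extra blank line.

```python
import csv
import os
import tempfile

rows = [['1', 'a'], ['2', 'b']]
path = os.path.join(tempfile.mkdtemp(), 'newline_demo.csv')

# newline='' disables newline translation, so the csv module's own
# '\r\n' row terminator is written exactly once per row.
with open(path, 'w', encoding='utf-8', newline='') as file:
    csv.writer(file).writerows(rows)

# Read the file in binary mode to inspect the exact bytes on disk.
with open(path, 'rb') as file:
    raw = file.read()

print(raw)  # b'1,a\r\n2,b\r\n'
```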

Tip

def toCSV(mnet_list):
    file = open('mnet_chart.csv', 'w', encoding='utf-8', newline='')
    csvfile = csv.writer(file)
    for row in mnet_list:
        csvfile.writerow(row)
    file.close()

Instead of the above, you can write it as follows.

This is convenient because the with statement calls file.close() for you automatically.

def toCSV(mnet_list):
    with open('mnet_chart.csv', 'w', encoding='utf-8', newline='') as file :
        csvfile = csv.writer(file)
        for row in mnet_list:
            csvfile.writerow(row)
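
One possible variation on the with version, offered only as a sketch: take the output path as a parameter, optionally write a header row, and use writerows() to write all rows in one call. The name `to_csv`, the `header` parameter, the column names, and the sample row are assumptions for illustration and do not come from the original post.

```python
import csv
import os
import tempfile


def to_csv(rows, path, header=None):
    """Write rows to path; header is an optional list of column names."""
    with open(path, 'w', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)  # writes every row in one call


# Made-up sample row in the same column order as mnet_list.
sample = [['1', 'img1.jpg', 'Song A', 'Artist A', 'Album A']]
out_path = os.path.join(tempfile.mkdtemp(), 'mnet_chart.csv')
to_csv(sample, out_path, header=['rank', 'img', 'title', 'artist', 'album'])

# Read the file back to check what was written.
with open(out_path, 'r', encoding='utf-8', newline='') as file:
    for row in csv.reader(file):
        print(row)
```

Passing the path in makes the function easier to reuse and to test against a temporary file.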

Source: http://db-log.tistory.com/entry/32-크롤링한-데이터를-csv-파일-만들기?category=766620 [떡빵로그]
