Package 'N2H4'

Title: Handling Methods for Naver News Text Crawling
Description: Provides functions to collect Korean text samples from news articles on Naver <https://news.naver.com/>, a popular news portal service in Korea.
Authors: Chanyub Park [aut, cre]
Maintainer: Chanyub Park <[email protected]>
License: MIT + file LICENSE
Version: 0.8.4
Built: 2024-11-21 03:50:07 UTC
Source: https://github.com/forkonlp/N2H4

Help Index


Get All Comment

Description

Get all comments from the given Naver news article URL.

Usage

getAllComment(turl)

Arguments

turl

character. URL of a news article on 'Naver', such as <https://n.news.naver.com/mnews/article/023/0003712918>. A URL that is not on the naver.com domain generates an error.

Details

Works like getComment, but finds and extracts all comments from the given URL.

Value

a tibble (see tibble::tibble-package)

Examples

## Not run: 
  getAllComment("https://n.news.naver.com/mnews/article/214/0001195110")
  
## End(Not run)
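Because a URL outside the naver.com domain generates an error, it can help to filter URLs before calling getAllComment. The following is a minimal sketch; is_naver_url is a hypothetical helper that is not part of N2H4, and the check the package performs internally may differ.

```r
# Hypothetical helper, not part of N2H4: TRUE when the host is naver.com
# or one of its subdomains.
is_naver_url <- function(url) {
  grepl("^https?://([A-Za-z0-9-]+\\.)*naver\\.com(/|$)", url)
}

urls <- c(
  "https://n.news.naver.com/mnews/article/023/0003712918",
  "https://example.com/news/1"
)
ok <- is_naver_url(urls)

# Network call, so run it only interactively with N2H4 installed.
if (interactive() && requireNamespace("N2H4", quietly = TRUE)) {
  comments <- lapply(urls[ok], N2H4::getAllComment)
}
```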

Get All Comment History

Description

Get All Comment History

Usage

getAllCommentHistory(turl, commentNo)

Arguments

turl

character. URL of a news article on 'Naver', such as <https://n.news.naver.com/mnews/article/001/0009205077?sid=102>. A URL that is not on the naver.com domain generates an error.

commentNo

Parent comment number.

Value

a tibble (see tibble::tibble-package)

Examples

## Not run: 
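  cno <- getAllComment("https://n.news.naver.com/mnews/article/214/0001195110?sid=103")
  getAllCommentHistory("https://n.news.naver.com/mnews/article/214/0001195110?sid=103",
    cno$commentNo[1])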
  
## End(Not run)

News Category

Description

News Category

Usage

getCategory(fresh = FALSE)

Arguments

fresh

logical. If TRUE, get the data online. The default, FALSE, uses the cached built-in data.


Get Comment

Description

Get Naver news comments. To extract only the comment data, use a command such as getComment(url)$result$commentList[[1]].

Usage

getComment(turl, count = 10, type = c("df", "list"))

Arguments

turl

character. URL of a Naver news article, such as <https://n.news.naver.com/mnews/article/023/0003712918>.

count

number of comments to get. Default is 10. Use "all" to get all comments.

type

return type, either "df" or "list". Default is "df". "df" returns only part of the data, not all of it.

Value

a tibble (see tibble::tibble-package)

Examples

## Not run: 
  getComment("https://n.news.naver.com/mnews/article/421/0002484966?sid=100")

## End(Not run)
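The count and type arguments can be combined as sketched below. The resolve_type function is a hypothetical stand-in, not N2H4 code: it only illustrates the common R idiom in which a default such as type = c("df", "list") resolves to its first element. Whether getComment uses match.arg internally is an assumption.

```r
# Hypothetical stand-in, not N2H4 code: shows how a default like
# type = c("df", "list") is commonly resolved in R.
resolve_type <- function(type = c("df", "list")) {
  match.arg(type)
}

default_type <- resolve_type()       # first choice is used when omitted
list_type <- resolve_type("list")

# Network calls, so run them only interactively with N2H4 installed.
if (interactive() && requireNamespace("N2H4", quietly = TRUE)) {
  url <- "https://n.news.naver.com/mnews/article/421/0002484966?sid=100"
  first_ten <- N2H4::getComment(url)                   # count = 10, type = "df"
  everything <- N2H4::getComment(url, count = "all")   # all comments
  as_list <- N2H4::getComment(url, type = "list")      # list form instead of df
}
```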

Get Comment History

Description

Get Naver news comments on user histories.

Usage

getCommentHistory(turl, commentNo, count = 10, type = c("df", "list"))

Arguments

turl

character. URL of a news article on 'Naver', such as <https://n.news.naver.com/mnews/article/001/0009205077?sid=102>. A URL that is not on the naver.com domain generates an error.

commentNo

Parent comment number.

count

number of comments to get. Default is 10. Use "all" to get all comments.

type

return type, either "df" or "list". Default is "df". "df" returns only part of the data, not all of it.

Value

a tibble (see tibble::tibble-package)

Examples

## Not run: 
  cno <- getComment("https://n.news.naver.com/mnews/article/421/0002484966?sid=100")
  getCommentHistory("https://n.news.naver.com/mnews/article/421/0002484966?sid=100",
    cno$commentNo[1])

## End(Not run)

Get Content

Description

Get Naver news content from article links.

Usage

getContent(
  turl,
  col = c("url", "original_url", "section", "datetime", "edittime", "press", "title",
    "body")
)

Arguments

turl

character. Naver news article link.

col

columns to get from the news article. Default is all columns.

Value

a tibble (see tibble::tibble-package)

Examples

## Not run: 
  getContent("https://n.news.naver.com/mnews/article/214/0001195110?sid=103")
  
## End(Not run)
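To request only some columns, pass a subset of the documented names to col. The sketch below assumes that getContent accepts any subset of its default col vector; the variable names all_cols and wanted are illustrative only.

```r
# Column names copied from the documented default of `col`.
all_cols <- c("url", "original_url", "section", "datetime", "edittime",
              "press", "title", "body")

# Columns of interest; check them for typos before making a request.
wanted <- c("title", "datetime", "body")
stopifnot(all(wanted %in% all_cols))

# Network call, so run it only interactively with N2H4 installed.
if (interactive() && requireNamespace("N2H4", quietly = TRUE)) {
  article <- N2H4::getContent(
    "https://n.news.naver.com/mnews/article/214/0001195110?sid=103",
    col = wanted
  )
}
```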

Get News Main Categories

Description

Get the current Naver news main category names and ids.

Usage

getMainCategory()

Value

a tibble (see tibble::tibble-package)

Examples

## Not run: 
  getMainCategory()
  
## End(Not run)

Get News Sub Categories

Description

Get the current Naver news sub category names and urls.

Usage

getSubCategory(sid1 = 100)

Arguments

sid1

Main category id in the Naver news URL. Only one value is allowed. Default is 100, which means Politics.

Value

a tibble (see tibble::tibble-package)

Examples

## Not run: 
  getSubCategory(100)
  
## End(Not run)
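Since sid1 accepts only one value, several main categories have to be queried one at a time. The following is a minimal sketch; fetch_each is a hypothetical helper that is not part of N2H4, and the id 101 is an assumed second category id used only for illustration (ids can be looked up with getMainCategory).

```r
# Hypothetical helper, not part of N2H4: call `fetch` once per id and
# name each result after the id it came from.
fetch_each <- function(ids, fetch) {
  stats::setNames(lapply(ids, fetch), ids)
}

sids <- c(100, 101)  # 101 is an assumed id, for illustration only

# Network calls, so run them only interactively with N2H4 installed.
if (interactive() && requireNamespace("N2H4", quietly = TRUE)) {
  sub_categories <- fetch_each(sids, N2H4::getSubCategory)
}
```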