I am taking part in the SegmentFault Sifu community's 10th anniversary "Q&A" check-in, and you are welcome to join too.

The SegmentFault Sifu community's 10th anniversary "Q&A" check-in has been very popular recently, but there is one small problem: I often don't know whether I have met today's KPI, or whether the small tail (the event link appended to each answer) was added properly. So today, let's build a small tool.

Analysis

  1. Open your personal Q&A page first.
  2. Check whether there is a separate interface for pulling the Q&A data. (Thanks to the site's developers for a later optimization, there is a direct interface.)
  3. Right-click the request, choose "Copy as fetch", and we are ready to go.

Adapt and loop

Adapt the request so that it fetches all the data and filters out what we don't care about (GraphQL would be nicer, but unfortunately this interface is not GraphQL).

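// Fetch all of a user's answers created after startTime (Unix seconds), newest first.
function getAnswers(username, page = 1, startTime = new Date('2022-06-01 00:00:00.000').getTime() / 1000){
    return fetch(`https://segmentfault.com/gateway/homepage/${username}/answers?size=20&page=${page}&sort=newest`)
        .then(v=>v.json())
        .then(v=>v.rows)
        .then(async rows=>{
            // A full page whose oldest row is still newer than startTime means there may be more.
            if(rows.length === 20 && (rows[rows.length - 1]?.created || 0) > startTime){
                return rows.concat(await getAnswers(username, page + 1, startTime))
            }else{
                // Last relevant page: drop rows that are not newer than startTime.
                return rows.filter(row=>row.created > startTime)
            }
        })
}
getAnswers('linong')
    .then(console.log)

// Convert a `created` value (seconds) to a readable date:
// new Date(1655005451 * 1000).toLocaleString();
// Start time in milliseconds:
// new Date('2022-06-01 00:00:00.000').getTime()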
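
The stop rule in the recursive version above can be easier to follow as a loop. Below is a minimal sketch of the same idea; `hasMorePages`, `collectSince`, and the injected `fetchPage` callback are hypothetical names introduced here for illustration, not part of the site's interface.

```javascript
// Hypothetical helper names; the stop rule mirrors the recursive getAnswers above.
const PAGE_SIZE = 20;

// Decide whether another page is worth requesting:
// the page is full and its oldest row is still newer than startTime.
function hasMorePages(rows, startTime, pageSize = PAGE_SIZE) {
    const oldest = rows[rows.length - 1];
    return rows.length === pageSize && (oldest?.created || 0) > startTime;
}

// fetchPage(page) is assumed to resolve to an array of rows sorted newest first.
async function collectSince(fetchPage, startTime, pageSize = PAGE_SIZE) {
    const collected = [];
    for (let page = 1; ; page++) {
        const rows = await fetchPage(page);
        collected.push(...rows.filter(row => row.created > startTime));
        if (!hasMorePages(rows, startTime, pageSize)) {
            return collected;
        }
    }
}
```

In the browser, `fetchPage` would wrap the same request as above, returning the `rows` array of the JSON response for the given page.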


This is for learning purposes only; do not break the law!

Check several active users

await getAnswers('hfhan')
    .then(console.log)
await getAnswers('jamesfancy')
    .then(console.log)
await getAnswers('nickw_cn')
    .then(console.log)
await getAnswers('xdsnet')
    .then(console.log)

Note that the username used here is a fixed value taken from the profile URL, which can differ from the displayed user name. So let's extract it from the URL instead of picking it out by hand:

'https://segmentfault.com/u/jamesfancy/answers'.match(/\/u\/([^/]+)/)[1]
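
The one-liner above throws if the URL has no `/u/...` segment. A small wrapper can return `null` instead; this is a sketch, and `extractUsername` is a hypothetical helper name used only for illustration.

```javascript
// Returns the profile slug from a URL such as https://segmentfault.com/u/<slug>/answers,
// or null when the URL has no /u/<slug> segment.
function extractUsername(profileUrl) {
    // Stop at "/", "?" or "#" so query strings and hashes are not captured.
    const match = profileUrl.match(/\/u\/([^/?#]+)/);
    return match ? match[1] : null;
}
```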


Analyze how to detect the small tail

Because the interface is not GraphQL, the answer data above is all it returns. To check whether the small tail was added, we need another round of collection.

As far as I can see, no interface exposes this, so we have to process the HTML directly.

var xhr = new XMLHttpRequest()
xhr.open('get', 'https://segmentfault.com/q/1010000041964562/a-1020000041964682')
xhr.responseType = 'document'
// Register the handler before sending the request.
xhr.onload = () => console.log(xhr.response, xhr.response.querySelector('[id="1020000041964682"] [href^="https://segmentfault.com/a/1190000041925107"]'))
xhr.send();

This way, we can use a selector to check directly whether the answer contains the feature we are looking for. Guess why I used xhr here instead of fetch?
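
One likely reason for the question above: with `responseType = 'document'` the browser hands back a parsed Document, while fetch has no such response type, so the HTML has to be parsed by hand. Here is a minimal sketch of a fetch-based equivalent, assuming a browser environment with `DOMParser`; `tailSelector`, `hasTail`, and `fetchDocument` are hypothetical names introduced for illustration.

```javascript
// Build the selector used above: a link starting with tailHref inside the answer element.
function tailSelector(answerId, tailHref) {
    return `[id="${answerId}"] [href^="${tailHref}"]`;
}

// Works with any object exposing querySelector (a Document in the browser).
function hasTail(doc, answerId, tailHref) {
    return doc.querySelector(tailSelector(answerId, tailHref)) !== null;
}

// Browser-only: fetch yields text, so the HTML must be parsed explicitly.
async function fetchDocument(url) {
    const response = await fetch(url);
    const html = await response.text();
    return new DOMParser().parseFromString(html, 'text/html');
}
```

With these, `hasTail(await fetchDocument(url), id, 'https://segmentfault.com/a/1190000041925107')` would play the same role as the xhr snippet.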

Extract the id from the URL

'https://segmentfault.com/q/1010000041964562/a-1020000041964682'.match(/\/a-(\d+)$/)[1]
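
The regular expression above only matches when the URL ends exactly with the answer id. The sketch below is a slightly more tolerant variant that also accepts relative URLs, a trailing slash, or a query string; `extractAnswerId` is a hypothetical helper name, and the base URL is only used to resolve relative paths.

```javascript
// Returns the answer id from a URL ending in /a-<digits>, or null if there is none.
function extractAnswerId(answerUrl) {
    // The base is only used when answerUrl is relative, e.g. "/q/.../a-...".
    const { pathname } = new URL(answerUrl, 'https://segmentfault.com');
    const match = pathname.match(/\/a-(\d+)\/?$/);
    return match ? match[1] : null;
}
```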


Adapt the loop

// Fetch all of a user's answers created after startTime (Unix seconds), newest first.
function getAnswers(username, page = 1, startTime = new Date('2022-06-01 00:00:00.000').getTime() / 1000){
    return fetch(`https://segmentfault.com/gateway/homepage/${username}/answers?size=20&page=${page}&sort=newest`)
        .then(v=>v.json())
        .then(v=>v.rows)
        .then(async rows=>{
            if(rows.length === 20 && (rows[rows.length - 1]?.created || 0) > startTime){
                return rows.concat(await getAnswers(username, page + 1, startTime))
            }else{
                return rows.filter(row=>row.created > startTime)
            }
        })
}
// Resolve to the small-tail link element inside the answer, or null if it is missing.
function checkAnswerExt(url){
    const id = url.match(/\/a-(\d+)$/)[1];

    return new Promise(function(resolve, reject){
        var xhr = new XMLHttpRequest()
        xhr.open('get', `https://segmentfault.com${url}`)
        xhr.responseType = 'document'
        xhr.onload = () =>
            resolve(
                xhr
                    .response
                    .querySelector(`[id="${id}"] [href^="https://segmentfault.com/a/1190000041925107"]`)
            )
        // Reject on network failure so the caller does not wait forever.
        xhr.onerror = () => reject(new Error(`Request failed: ${url}`))
        xhr.send();
    })
}
answers = [];
await getAnswers('cowcomic')
    .then(async function(list){
        // Check the answers one at a time (sequential requests).
        for(var i = 0; i < list.length; i++){
            answers.push({
                answers: list[i],
                checked: await checkAnswerExt(list[i].url)
            })
        }
    })
answers


The result shows that the small tail is indeed missing, so it needs to be added.

Analyze the data

We now have all the data we care about from the personal Q&A page. Next, we can group it to find every answer without a small tail, and then go back and add the tail to those answers.

You can also group by date to see how many answers were completed each day.
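
The grouping described above might look like the sketch below. It assumes entries shaped like the `answers` array built earlier (`{ answers: row, checked: element or null }`, with `row.created` in Unix seconds); `dayKey` and `summarizeByDay` are hypothetical names, and grouping by UTC day is an assumption made to keep the output deterministic.

```javascript
// Map a Unix timestamp in seconds to a "YYYY-MM-DD" key (UTC calendar day, an assumption).
function dayKey(createdSeconds) {
    return new Date(createdSeconds * 1000).toISOString().slice(0, 10);
}

// entries: [{ answers: { created: <unix seconds>, ... }, checked: <Element or null> }]
// Returns { "YYYY-MM-DD": { total, missingTail: [rows without a small tail] } }.
function summarizeByDay(entries) {
    const summary = {};
    for (const entry of entries) {
        const key = dayKey(entry.answers.created);
        if (!summary[key]) {
            summary[key] = { total: 0, missingTail: [] };
        }
        summary[key].total += 1;
        if (!entry.checked) {
            summary[key].missingTail.push(entry.answers);
        }
    }
    return summary;
}
```

Calling `summarizeByDay(answers)` in the console would then give the daily count together with the answers that still need a tail.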

That's it. One final reminder: this is for learning only; do not break the law!


linong