Category Archives: Academic Research
Extracting the Baidu Index
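For the past month or so I have been wanting to work on one topic: the relationship between the Baidu Index and stock prices. A few days ago, just as I got my program working and could extract the index, I came across the paper In Search of Attention among the Forthcoming Article Abstracts of The Journal of Finance. Its idea is almost identical to mine, but the work it does is clearly more extensive and better than anything I could manage. My heart sank, and I lost the urge to continue. Today, though, I talked with my advisor, who said that since the data is already prepared, it would be a pity not to do something with it. So I will treat it as practice and carry on, with the aim of learning panel data analysis.

Although this particular topic has already been taken, the Baidu Index (which reflects how often people search Baidu for a given keyword) has many other possible applications in social science, because what it really reflects is public attention. Attention is usually hard to quantify; with the index as a proxy variable, I think quite a lot of analysis becomes possible. Researchers abroad have already started analyzing social network data, while I have not seen much literature of this kind here. That said, I have not yet seen any result that is widely accepted either at home or abroad; the paper above is still forthcoming, after all. Many earlier theories were hard to test for lack of data; with search data or social network data, there is now a lot that can be done.

If the Baidu Index opened up an API, even more would be possible. I would also like to use the APIs offered by Weibo and other social networks to extract and analyze data, which I find interesting too. Quite a few people abroad have started doing this, and in China I seem to have seen some companies doing it; 杜子健 appears to be one of them.

Enough digression; on to extracting the Baidu Index. I use MATLAB's image processing to extract the curve from the chart image. The results are as follows.

Original image (generated by clicking "generate image" on the Baidu Index page):

After extraction:

The green line in the figure above is drawn from the extracted data, and it basically coincides with the original line. I compared the extracted data with the original data; the maximum error is about 3%, which is satisfactory.

I had meant to provide the MATLAB program here as well, but it turns out I can no longer create standalone pages on this hosting space, so never mind.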
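The curve extraction described above can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and is not the original program; it only shows the general idea under stated assumptions: the chart is already loaded as a grid of RGB pixels, the color of the curve is known, and the pixel rows of the top and bottom of the plot area, together with the index values they correspond to, have been read off the axis by hand. All function and parameter names here are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def is_curve_pixel(pixel: Pixel, target: Pixel, tol: int = 40) -> bool:
    """True if every channel is within `tol` of the assumed curve color."""
    return all(abs(c - t) <= tol for c, t in zip(pixel, target))


def extract_curve(
    image: Sequence[Sequence[Pixel]],  # image[row][col], row 0 at the top
    target: Pixel,                     # assumed color of the plotted line
    top_row: int,                      # pixel row of the top of the plot area
    bottom_row: int,                   # pixel row of the bottom of the plot area
    top_value: float,                  # index value at top_row (read from the axis)
    bottom_value: float,               # index value at bottom_row
    tol: int = 40,
) -> List[Optional[float]]:
    """Return one estimated value per pixel column (None if no curve pixel)."""
    values: List[Optional[float]] = []
    for col in range(len(image[0])):
        rows = [
            r for r in range(top_row, bottom_row + 1)
            if is_curve_pixel(image[r][col], target, tol)
        ]
        if not rows:
            values.append(None)
            continue
        # A thick or steep line covers several rows; use their mean position.
        center = sum(rows) / len(rows)
        # Linear map from pixel row to data value (rows grow downward).
        frac = (bottom_row - center) / (bottom_row - top_row)
        values.append(bottom_value + frac * (top_value - bottom_value))
    return values


def max_relative_error(
    extracted: Sequence[Optional[float]], reference: Sequence[float]
) -> float:
    """Largest |extracted - reference| / |reference| over comparable points."""
    return max(
        abs(e - r) / abs(r)
        for e, r in zip(extracted, reference)
        if e is not None and r != 0
    )
```

To use it, one would load the saved chart with an image library, pass the pixel grid to `extract_curve`, and then resample the per-column values to one value per date. `max_relative_error` corresponds to the kind of check mentioned above, comparing extracted values against known originals. Taking the mean row per column is a simplification: on very steep segments it returns the midpoint of the vertical stroke, which is one plausible source of error at sharp spikes.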