My direction
My teaching courses range from Software Engineering to Network Engineering.
@article{tang2022long, title={Long text feature extraction network with data augmentation}, author={Tang, Changhao and Ma, Kun and Cui, Benkuan and Ji, Ke and Abraham, Ajith}, journal={Applied Intelligence}, pages={1--16}, year={2022}, publisher={Springer} }
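To effectively detect unknown malware on mobile devices, a machine-learning-based approach is proposed that combines extracted, robust statistical features of network traffic to train a detection model capable of identifying unknown malicious mobile network traffic. The model mainly comprises Android malware sample data preprocessing, automatic network traffic data collection, and machine learning detection model training. Its effectiveness is validated by experiments on detecting zero-day malware at different points in time. The results show that the proposed method achieves a detection accuracy above 90% on unknown malicious samples, with an F-measure of 80%.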
@article{李浩2019基于网络流量分析的未知恶意软件检测, title={基于网络流量分析的未知恶意软件检测}, author={李浩 and 马坤 and 陈贞翔 and 赵川}, journal={济南大学学报(自然科学版)}, number={6}, year={2019}, }
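To address text sentiment classification in natural language processing, a text sentiment classification method based on ensemble learning is proposed. Given the particular characteristics of Weibo data, the data are first preprocessed with word segmentation and related steps; term frequency-inverse document frequency (TF-IDF) and singular value decomposition (SVD) are then combined for feature extraction and dimensionality reduction; finally, the classification models are fused through stacked generalization (stacking) ensemble learning. The results show that model fusion reaches an accuracy of 93% in text sentiment analysis and can effectively determine the sentiment polarity of Weibo texts.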
@article{段吉东2019基于集成学习的文本情感分类方法, title={基于集成学习的文本情感分类方法}, author={段吉东 and 刘双荣 and 马坤 and 孙润元}, journal={济南大学学报(自然科学版)}, year={2019}, }
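The TF-IDF weighting step named in the abstract above can be illustrated with a minimal sketch. It assumes the documents are already segmented into tokens and uses the plain `tf * log(N / df)` formula; the paper may use a smoothed variant, and the SVD and stacking stages are not reproduced here. The function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    docs: list of token lists. Returns one dict (term -> weight) per document.
    Assumption: unsmoothed weighting, tf * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document gets weight zero under this formula, which is why only distinguishing words contribute to the feature vector.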
Social network services for self-media, such as Weibo, Blog, and WeChat Public, constitute a powerful medium that allows users to publish posts every day. Due to insufficient information transparency, malicious marketing of the Internet from self-media posts imposes potential harm on society. Therefore, it is necessary to identify news with marketing intentions for life. We follow the idea of text classification to identify marketing intentions. Although there are some current methods to address intention detection, the challenge is how the feature extraction of text reflects semantic information and how to improve the time complexity and space complexity of the recognition model. To this end, this paper proposes a machine learning method to identify marketing intentions from large-scale We-Media data. First, the proposed Latent Semantic Analysis (LSI)-Word2vec model can reflect the semantic features. Second, the decision tree model is simplified by decision tree pruning to save computing resources and reduce the time complexity. Finally, this paper examines the effects of classifier associations and uses the optimal configuration to help people efficiently identify marketing intention. Finally, the detailed experimental evaluation on several metrics shows that our approaches are effective and efficient. The F1 value can be increased by about 5%, and the running time is increased by 20%, which prove that the newly-proposed method can effectively improve the accuracy of marketing news recognition.
@Article{fi11070155, AUTHOR = {Wang, Yufeng and Liu, Shuangrong and Li, Songqian and Duan, Jidong and Hou, Zhihao and Yu, Jia and Ma, Kun}, TITLE = {Stacking-Based Ensemble Learning of Self-Media Data for Marketing Intention Detection}, JOURNAL = {Future Internet}, VOLUME = {11}, YEAR = {2019}, NUMBER = {7}, ARTICLE-NUMBER = {155}, URL = {https://www.mdpi.com/1999-5903/11/7/155}, ISSN = {1999-5903}, ABSTRACT = {Social network services for self-media, such as Weibo, Blog, and WeChat Public, constitute a powerful medium that allows users to publish posts every day. Due to insufficient information transparency, malicious marketing of the Internet from self-media posts imposes potential harm on society. Therefore, it is necessary to identify news with marketing intentions for life. We follow the idea of text classification to identify marketing intentions. Although there are some current methods to address intention detection, the challenge is how the feature extraction of text reflects semantic information and how to improve the time complexity and space complexity of the recognition model. To this end, this paper proposes a machine learning method to identify marketing intentions from large-scale We-Media data. First, the proposed Latent Semantic Analysis (LSI)-Word2vec model can reflect the semantic features. Second, the decision tree model is simplified by decision tree pruning to save computing resources and reduce the time complexity. Finally, this paper examines the effects of classifier associations and uses the optimal configuration to help people efficiently identify marketing intention. Finally, the detailed experimental evaluation on several metrics shows that our approaches are effective and efficient. The F1 value can be increased by about 5%, and the running time is increased by 20%, which prove that the newly-proposed method can effectively improve the accuracy of marketing news recognition.}, DOI = {10.3390/fi11070155} }
This article presents a software library for the Arduino platform which significantly improves the speed of the functions for digital input and output. This allows users to apply these functions in a whole range of applications, without being forced to resort to direct register access or various third-party libraries when the standard Arduino functions are too slow for a given application. The method used in this library is also applicable to other libraries which aim to abstract the access to general-purpose pins of a microcontroller.
@article {YuToward2015, title={Toward Core Point Evolution Using Water Ripple Model}, author={Zhibing Yu and Kun Ma}, journal={WSEAS Transactions on Computers}, pages={819-825}, year={2015}, volume={14}, number={Art. #79}}
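This work analyzes the problem of detecting duplicate values in incremental data during duplicate-data detection and, building on the basic sorted neighborhood algorithm, proposes an incremental sorted neighborhood comparison algorithm. The algorithm compares adjacent records using a jumping window, which greatly reduces the number of comparisons; the MapReduce model is also introduced to improve the algorithm's ability to process massive data. Experiments show that, while keeping detection results accurate, the improved incremental sorted neighborhood comparison algorithm effectively speeds up duplicate detection on incremental data, is highly stable, and is better suited to duplicate detection tasks in massive-data environments.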
@article{董富森2015mapreduce, title={MapReduce 模型下增量重复数据检测方法}, author={董富森 and 杨波 and 马坤 and 王文华}, journal={济南大学学报 (自然科学版)}, volume={4}, pages={001}, year={2015}}
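The basic sorted neighborhood comparison that the abstract above builds on can be sketched as follows. This is only the baseline technique with a sliding window over key-sorted records; the paper's jumping window, incremental handling, and MapReduce distribution are not reproduced. The function name and its parameters (`key`, `is_duplicate`, `window`) are hypothetical.

```python
def sorted_neighborhood(records, key, is_duplicate, window=3):
    """Return sorted index pairs (i, j) of records judged to be duplicates.

    records: list of records.
    key: function producing the sorting key of a record.
    is_duplicate: function deciding whether two records match.
    window: number of consecutive sorted records compared with each other.
    """
    # Sort record indices by key so likely duplicates become neighbors.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = []
    for pos, i in enumerate(order):
        # Compare each record only with the next (window - 1) records.
        for j in order[pos + 1 : pos + window]:
            if is_duplicate(records[i], records[j]):
                pairs.append((min(i, j), max(i, j)))
    return pairs
```

With n records and window size w, this performs on the order of n * w comparisons after sorting instead of comparing every pair, which is the saving the windowed approach relies on.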
Maintaining data indexes and the query cache becomes a bottleneck of the database, especially when data are updated frequently. To reduce the burden on the database, this paper proposes a cache system for frequently updated data. In the system, update statements are first parsed. The updated data are then saved as key-value pairs in the cache and synchronized into the database at idle time. Experimental results show that the proposed cache system can not only accelerate the data updating rate, but also greatly improve the data writing ability in maintaining indexes and consistency of cache data.
@article {DongCache2015, title={Cache System for Frequently Updated Data in the Cloud}, author={Fusen Dong and Kun Ma and Bo Yang}, journal={WSEAS Transactions on Computers}, pages={163-170}, year={2015}, volume={14}, number={Art. #17}}
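The idea in the abstract above, holding updates as key-value pairs and synchronizing them to the database at idle time, resembles a write-behind cache. The sketch below is an illustration under assumptions, not the paper's system: a plain dict stands in for the database, SQL update parsing is omitted, and the class and method names are hypothetical.

```python
class WriteBehindCache:
    """Buffer updates in memory and flush them to the database later."""

    def __init__(self, database):
        self.database = database  # dict standing in for the real database
        self.pending = {}         # key -> latest value not yet flushed

    def update(self, key, value):
        # Absorb the update in memory; repeated writes to one key
        # overwrite each other, so only the last value reaches the database.
        self.pending[key] = value

    def read(self, key):
        # A pending value is newer than the copy in the database.
        if key in self.pending:
            return self.pending[key]
        return self.database.get(key)

    def flush(self):
        """Write pending updates to the database; call when the system is idle.

        Returns the number of keys written.
        """
        written = len(self.pending)
        self.database.update(self.pending)
        self.pending.clear()
        return written
```

Two updates to the same key cost one database write at flush time, which is where the reduced load for frequently updated data would come from in a design like this.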
Content syndication is the process of pushing information out to third-party information providers. The idea is to drive more engagement with your content by wiring it into related digital contexts. However, current related products have shortcomings, such as search challenges on massive feeds, synchronization performance, and user experience. To address these limitations, we propose an improved architecture for content syndication and recommendation. First, we design a source listener to extract feed changes from different RSS sources and propagate the incremental changes to target schema-free document stores to improve search performance. Second, the proposed recommendation algorithm tidies, filters, and sorts all the feeds before pushing them to users automatically. Third, we provide OAuth2-authorized RESTful feed sharing APIs for integration with third-party systems. The experimental results show that this architecture speeds up the search and synchronization process and provides a friendlier user experience.
@article {TangRSSCube2014, title={RSSCube: A Content Syndication and Recommendation Architecture}, author={Zijie Tang and Kun Ma}, journal={International Journal of Database Theory and Application}, pages={237-248}, year={2014}, volume={7}, number={4}}
In the detection of fake news, the stance of comments usually contains evidence supporting false news that can be used to corroborate the detected results of the fake news. However, due to the misleading content of fake news, there is also the possibility of fake comments. By analyzing the stance of comments and considering the falseness of comments, comments can be used more effectively to detect fake news. In response to this problem, we propose Bipolar Argumentation Frameworks of Reset Comments Stance (BAFs-RCS) and Average Parameter Aggregation of Comments (APAC) to use the stance of comments to correct the prediction results of the RoBERTa model. We use the Fakeddit dataset for experiments. Our macro-F1 results on the 2-way and 3-way tasks improve by 0.0029 and 0.0038, respectively, over the baseline RoBERTa model's macro-F1 results on the Fakeddit dataset. The results show that our method can effectively use the stance of comments to correct model prediction errors.
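Community question answering (CQA) sites have become the Internet's most important platforms for sharing and exchanging knowledge. Effectively recommending the massive number of user-submitted questions to users who may be able to answer them, and mining the questions that users are interested in, is the core function of such platforms. Several expert recommendation algorithms for CQA communities have been proposed to improve answering efficiency, but most existing work focuses on matching user interests with question information and ignores the dynamic change of user interests, which may seriously affect recommendation quality. This paper proposes an expert recommendation algorithm that combines attention with recurrent neural networks; it not only achieves deep feature encoding of question information but also captures dynamically changing user interests. First, on top of pretrained word embeddings, the question encoder combines a convolutional neural network (CNN) with an attention mechanism to produce a joint deep feature representation of the question title and its attached tags. Then, the user encoder applies the long short-term memory neural network Bi-GRU model to the time series of questions the user has answered to capture dynamic interests, and combines it with the user's fixed tag information to represent long-term interests. Finally, recommendations that combine dynamic and long-term user interests are produced by computing the similarity between the output vectors of the two encoders. We ran comparative experiments with different parameter configurations and different algorithms on real data from the Zhihu Q&A community, showing that the algorithm clearly outperforms currently popular deep learning expert recommendation algorithms.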
In the context of natural language processing, accuracy of intention detection is the basis for subsequent research on human-machine speech interaction. However, the problem of ambiguity in word vectors reduces the accuracy of intent detection. Meantime, there is a disconnection between local features and global features as well, resulting in text feature extraction that cannot fully reflect semantic information. These issues are all barriers of intention detection. Therefore, this paper proposes an attention-based convolutional neural network for self-media data learning (called A-CNN) for marketing intention. We cascade the traditional CNN with the self-attention model in the Attention networks to form a new network structure called A-CNN, and put forward a fast feature extraction method based on skip-gram-based learning called FSLText, to represent the high-dimension word vectors in the A-CNN. On the premise of maintaining the advantages of the CNN, A-CNN can not only solve the problem of local and global features disconnection caused by the CNN pooling layer, but also avoid the increase of algorithm complexity. The Self-Attention mechanism in the Attention model can effectively optimize the weight of local features of the information in global features, and retain local features that are more useful for intention detection. A fast feature extraction method which is based on Skip-gram can retain the semantic and word order information of the text. The method is beneficial to the marketing intention detection. According to the experiment, our A-CNN, compared with traditional machine learning methods, can improve 12.32% accuracy. Contrast to the dual-channel CNN, the accuracy rate is improved by 9.68%, and compared with the ATT-CNN, it is improved by 9.97%. On the F1 score, the A-CNN can improve the F1 score by about 9.37% in comparison with the traditional machine learning methods, the accuracy rate is increased by 9.68% compared with the dual-channel CNN, and 9.6
@article{HOU2021104118, title = {Attention-based learning of self-media data for marketing intention detection}, journal = {Engineering Applications of Artificial Intelligence}, volume = {98}, pages = {104118}, year = {2021}, issn = {0952-1976}, doi = {https://doi.org/10.1016/j.engappai.2020.104118}, url = {https://www.sciencedirect.com/science/article/pii/S0952197620303572}, author = {Zhihao Hou and Kun Ma and Yufeng Wang and Jia Yu and Ke Ji and Zhenxiang Chen and Ajith Abraham}, keywords = {Marketing intention detection, Attention model, Convolutional neural network, Feature extraction}, abstract = {In the context of natural language processing, accuracy of intention detection is the basis for subsequent research on human-machine speech interaction. However, the problem of ambiguity in word vectors reduces the accuracy of intent detection. Meantime, there is a disconnection between local features and global features as well, resulting in text feature extraction that cannot fully reflect semantic information. These issues are all barriers of intention detection. Therefore, this paper proposes an attention-based convolutional neural network for self-media data learning (called A-CNN) for marketing intention. We cascade the traditional CNN with the self-attention model in the Attention networks to form a new network structure called A-CNN, and put forward a fast feature extraction method based on skip-gram-based learning called FSLText, to represent the high-dimension word vectors in the A-CNN. On the premise of maintaining the advantages of the CNN, A-CNN can not only solve the problem of local and global features disconnection caused by the CNN pooling layer, but also avoid the increase of algorithm complexity. The Self-Attention mechanism in the Attention model can effectively optimize the weight of local features of the information in global features, and retain local features that are more useful for intention detection. 
A fast feature extraction method which is based on Skip-gram can retain the semantic and word order information of the text. The method is beneficial to the marketing intention detection. According to the experiment, our A-CNN, compared with traditional machine learning methods, can improve 12.32% accuracy. Contrast to the dual-channel CNN, the accuracy rate is improved by 9.68%, and compared with the ATT-CNN, it is improved by 9.97%. On the F1 score, the A-CNN can improve the F1 score by about 9.37% in comparison with the traditional machine learning methods, the accuracy rate is increased by 9.68% compared with the dual-channel CNN, and 9.68% in contrast with ATT-CNN. It illustrates that our A-CNN can effectively address semantic and feature selection for marketing intention detection.} }
In the mobile computing environment, how to make the data access more efficient is a challenge due to the narrow communication bandwidth, the frequent disconnections of network, and the limited resources. Therefore, it is necessary to cache data on the client side. Besides, a good cache consistency method is essential to ensure the correctness. In this article, a row‐based semantic cache with incremental versioning consistency (RSCVC) is proposed. In RSCVC, we designed a semantic cache algorithm, a query trimming and optimizing algorithm, and a version‐based consistency strategy. This RSCVC cache mainly has two advantages. On one hand, it can obviously improve the response time of query and the hit ratio of the cache. On the other hand, the version‐based consistency enhances the stability of the system especially in high‐concurrency situations. Experiments demonstrate the efficacy of our proposed method and its superiority to state‐of‐the‐art methods.
@article{doi:10.1002/cpe.5672, author = {Yang, Zhe and Ma, Kun and Zhang, Xiaoli and Cui, Lizhen and Yang, Bo}, title = {RSCVC: Row-based semantic cache with incremental versioning consistency}, journal = {Concurrency and Computation: Practice and Experience}, volume = {n/a}, number = {n/a}, pages = {e5672}, keywords = {cache loading, cache penetration, cache snowslide, data consistency, query optimization, semantic cache}, doi = {10.1002/cpe.5672}, url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.5672}, eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.5672}, abstract = {In the mobile computing environment, how to make the data access more efficient is a challenge due to the narrow communication bandwidth, the frequent disconnections of network, and the limited resources. Therefore, it is necessary to cache data on the client side. Besides, a good cache consistency method is essential to ensure the correctness. In this article, a row-based semantic cache with incremental versioning consistency (RSCVC) is proposed. In RSCVC, we designed a semantic cache algorithm, a query trimming and optimizing algorithm, and a version-based consistency strategy. This RSCVC cache mainly has two advantages. On one hand, it can obviously improve the response time of query and the hit ratio of the cache. On the other hand, the version-based consistency enhances the stability of the system especially in high-concurrency situations. Experiments demonstrate the efficacy of our proposed method and its superiority to state-of-the-art methods.} }
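The version-based consistency idea in the abstract above can be illustrated with a small sketch: each row carries a version counter, and the client reuses a cached row only while its version matches the server's. This is a generic illustration under assumptions, not the RSCVC algorithm itself; semantic caching and query trimming are omitted, and all class and method names are hypothetical.

```python
class VersionedStore:
    """Stand-in for the server: every row carries a version counter."""

    def __init__(self):
        self.rows = {}  # row_id -> (value, version)

    def write(self, row_id, value):
        # Each write increments the row's version, starting at 1.
        _, version = self.rows.get(row_id, (None, 0))
        self.rows[row_id] = (value, version + 1)

    def version(self, row_id):
        return self.rows[row_id][1]

    def read(self, row_id):
        return self.rows[row_id]


class ClientCache:
    """Client-side row cache validated by comparing row versions."""

    def __init__(self, store):
        self.store = store
        self.rows = {}  # row_id -> (value, version) as last fetched
        self.hits = 0

    def get(self, row_id):
        cached = self.rows.get(row_id)
        # The cached row is valid only if its version is still current.
        if cached is not None and cached[1] == self.store.version(row_id):
            self.hits += 1
            return cached[0]
        # Stale or missing: fetch the row again and remember its version.
        value, version = self.store.read(row_id)
        self.rows[row_id] = (value, version)
        return value
```

In this sketch the version check is a small exchange compared with transferring the full row, which is the kind of saving a version-based scheme aims for on a narrow mobile link.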
To meet the urgent need for security protection of data files on mobile intelligent terminals, we design a file security box to protect and manage important files on smartphones. The main functions of this file security box are: 1) fingerprint verification; 2) file management; 3) efficient encryption and decryption; 4) adaptive cipher algorithms; 5) separate logical document library; 6) updating the secret key regularly; 7) storing the key securely; 8) reinforcing enterprise-level safety. Functional tests and performance tests show that the file security box not only achieves all of the above functionality with excellent user transparency and friendliness but also ensures the safety of important files without affecting the user experience.
@article{luo2017toward, title={Toward mobile smart data file protection box}, author={Luo, Tianren and Li, Xueyong and Ma, Kun and Luo, Xiaoying}, journal={International Journal of Autonomic Computing}, volume={2}, number={3}, pages={282--309}, year={2017}, publisher={Inderscience Publishers (IEL)} }
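Jiahao Zhang, Self-Service Ordering System, 2019
Fanghan Liu, Literature Management System, 2019
Songqian Li, Office Automation (OA) System, 2018
Hao Qu and Zhe Yang, Official Website of the University of Jinan, 2018
Hao Qu, Official Website of the School of Civil Engineering and Architecture, 2018
Hao Qu, Jayce, 2018
Hao Qu, Programer Chrome Tab, 2018
Hao Qu, Jingying Education (经英教育), 2018
Hao Qu, Shuimo Rensheng Mall (水墨人生商城), 2018
Hao Qu, Xiaoxianghui (校乡汇), 2016
Songqian Li, University of Jinan Student Affairs Online 2017, 2018
Songqian Li and Xuewei Niu, 2017
Songqian Li, Freshman Orientation System for the 2017 Cohort, 2018
Songqian Li, Student Affairs Online Recruitment System for the 2017 Cohort, 2018
Songqian Li, University of Jinan Student Affairs Online 2018, 2018
Zhe Yang, Shandong University Vehicle Management System, 2017
Zhe Yang, Official Website of the University of Jinan, 2017
Zhe Yang, Official Website of the School of Information, University of Jinan, 2017
Zhe Yang, Big-Data-Driven Innovation Method Work Platform, 2017
Zhe Yang, Fun Print System (趣打印), 2017
Xuewei Niu, Shaimi Photo Shoot Booking Platform (晒米约拍), 2017
Shuwei Yao, Online Student Peer Q&A System, 2017
Zhe Yang, Xiangsu (向素), 2016
Zhe Yang, C.D.Cafe Ordering System, WeChat ID cdcafe_chin, 2016
Zhe Yang, C.D. Takeout System - Miyou Private Kitchen (米优私厨), WeChat ID miyousichu, 2016
Zhe Yang, Shiquan Shimei Takeout (食全时美外卖), WeChat ID SQSMwaimai, 2016
Zhe Yang, Yile Study Abroad (以勒留学), 2016
Zhe Yang, Online Handbook of the School of Civil Engineering and Architecture, 2016
Zhe Yang, Hengxin Weijin (恒信微金) CRM, beta version (Jinan branch of Beijing Jiufu Wealth, 北京玖富财富), 2016
Zhe Yang, Cultural Center of Zhenlai County, Jilin Province, 2016
Xiaonan Ji, Static Blog, 2016
Xiaonan Ji, Doutu Website (斗图网), 2016
Xiaonan Ji, University of Jinan Property Management Center, 2016
Xiaonan Ji, University of Jinan Office of Cooperation and Development, 2016
Changxin Li, Sina Cloud CMS Blog, 2016
Hao Qu, University of Jinan Student Affairs Office, 2016
Hao Qu, Node.js-Based Blog: Blog of Houser, 2016
Hao Qu, About me, 2016
Hao Qu, School of Civil Engineering and Architecture, University of Jinan, 2016
Hao Qu, Social-Network-Based Student Club Management Service Platform, 2016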
Zhe Yang, Logistic Duty Management, 2015
Zhe Yang, Youth Literature, 2015
Zhe Yang, Student Online, 2014
Zhe Yang, USLab, 2014
Zhe Yang, Information Disclosure of UJN, 2014
Zhe Yang, Organization Department of UJN, 2013
Zhe Yang, Student Union of UJN, 2013
Zhe Yang, Yue Dong, 2015
Zhe Yang, Yue Qi, 2015
Zhe Yang, Sheng Shi, 2015
Zhe Yang, San Zhong, 2015
Zhe Yang, 988 Shopping, 2015
Zhe Yang, Blog of Zhe Yang, 2015
Zhe Yang, Internet Navigation, 2007
Zhe Yang, Zhongqi Data, 2010
Zhe Yang, Faxinbao, 2007
Zhe Yang, Lvtian, 2014
Zhe Yang, xiaocheng Blog, 2014
Zhe Yang, Jinxing, 2014
Zhe Yang, Dianti, 2014
Zhe Yang, Longao, 2014
Zhe Yang, Baihe, 2014
Zhe Yang, Jinmingtang, 2014
Zhe Yang, Lvwei, 2014
Zhe Yang, Dance Association of UJN, 2015
Zhe Yang, Logistic Management, 2014
Shuwei Yao, Online Courseware Management System, 2015
Zhe Yang and Shuwei Yao, Achievement Assistant, 2015
Zhe Yang and Shuwei Yao, purchase of second-hand unused goods, 2014
Zijie Tang, Youzi Fan, 2015
Zijie Tang, Information Youth of UJN, 2015
Zijie Tang, School of Political Science and Public Administration of UJN, 2015
Zijie Tang, Cultural Centre of UJN, 2015
Zijie Tang, UJNCMS, 2015
Zijie Tang, Student Union of UJN, 2015
Zijie Tang, DI JIANG, 2015
Zijie Tang, @ Me, 2013-2015, micro film 爱情概率论 (Probability Theory of Love)
Zijie Tang, @ Me (ujn), 2013-2015
Zijie Tang, RSS Cube, 2013
Zijie Tang, UJN Facemash, 2013
Zijie Tang, Love Wall, 2014