Open Access Paper
24 May 2022 MUI-ISIDA: an improved calculation method for user influence in social networks
Shulin Cheng, Ziming Wang, Meng Qian, Shan Yang, Xin Zheng
Author Affiliations +
Proceedings Volume 12260, International Conference on Computer Application and Information Security (ICCAIS 2021); 1226027 (2022) https://doi.org/10.1117/12.2637483
Event: International Conference on Computer Application and Information Security (ICCAIS 2021), 2021, Wuhan, China
Abstract
With the popularization and development of the Internet, microblogs have become a mainstream social network platform. The evaluation of user influence has become a research hotspot. Most of the existing researches calculate influence by improving PageRank. But these researches ignored the relationship between users’ interest theme similarity and information dissemination, and didn’t have enough analysis about the interaction behaviors among users. Aiming at these problems, we proposed a new microblog user influence algorithm—MUI-ISIDA (microblog user influence based on interest similarity and information dissemination ability) in this paper, which takes into account users’ interest theme similarity and information dissemination ability. We verified the effectiveness of the proposed algorithm on Sina microblog dataset. The experimental results show that compared with PageRank and MR-UIRank, the proposed algorithm has achieved higher accuracy in user influence ranking.

1.

INTRODUCTION

In recent years, with the rapid development and popularization of the Internet, social network has played an increasingly important role in the disseminations of information, ideas, and influence, and it has become an important medium for users to obtain and exchange information. In China, microblog like Twitter has become one of the most extensive social network platforms due to its openness and content simplicity. Different behaviors among users make information spread rapidly, which form the microblog information dissemination network.

Influence refers to the ability of an individual to influence others, change their thoughts or behaviors1. While users receive and disseminate information, they are also in the process of influencing and being influenced. Users with different influences have different effects on the speed and scope of information dissemination. High-influence users in network can be used as seed nodes to maximize the spread of influence2, and play active roles in many fields. Therefore, it’s of practical significance to measure users’ influences in social networks.

In order to achieve page ranking, Larry Page and Sergey Brin proposed the PageRank algorithm3. The algorithm has also been used to analyze microblog users’ influences, but it has certain limitations. Therefore, many experts and scholars have proposed some new algorithms on the basis of PageRank. Inspired by previous studies, we proposed MUI-ISIDA algorithm, which is comprehensive and reasonable to calculate uses’ influences by considering multiple factors, such as user interests, relationships between influence and theme4, user information dissemination ability. The results are reasonably ranked and experiments have verified the effectiveness of the algorithm.

2.

RELATED CONCEPTS AND METHODS

2.1

Microblog network

The ‘following’ relationships among users form the information dissemination network. The interactive behaviors among users, such as forwarding, commenting, etc., promoted the information spread again5. We selected the three behaviors of following, forwarding and commenting to construct the microblog network G<V, E>. V indicates the set of nodes and E represents the set of edges in the network. As shown in Figure 1, the nodes in the figure represent microblog users, and the edges represent certain relationships among users. If user A follows user B, forwards or comments on his/her microblog posts, which implies that user A has interacted with user B. There is an edge from A to B, and the weight of the edge is the closeness of the connection.

Figure 1.

Microblog network.

00246_psisdg12260_1226027_page_2_1.jpg

2.2

PageRank

PageRank uses the link structures among web pages to calculate the qualities of web pages, and uses a directed graph to represent the relationships among web pages. Each node represents a web page. And each edge represents a link between web pages in the graph. It is calculated as below:

00246_psisdg12260_1226027_page_2_2.jpg

where M(pi) stands for the set of all pages linked to pi. L(pj) represents the number of all external links of the web page pj. α is the damping factor, which represents the probability that the page is randomly accessed, and usually taking an empirical value of 0.85.

3.

MAIN IDEA OF MUI-ISIDA

In this part, we took the interest theme similarities among users into account, and introduced the user information dissemination ability as a weighting factor in PageRank, finally constructed the MUI-ISIDA algorithm model.

3.1

Interest similarity

In microblog network, users often pay time and effort to publish microblog posts, so their historical original microblog posts can directly represent their own interests and hobbies6. We constructed the users’ interest theme similarity model for calculating interest similarity. Since it is to identify topics that each user is interested in, rather than the topic of each microblog post is about. We integrated all original microblog posts published by the same user within a specified period of time into one document.

We chose the LDA model for topic mining. The LDA model, which treats each document as a ‘bag of words’, so each document emerges as a probability distribution over some topics7. We use this model and normalize it to get the document-topic distribution. And the result is represented by a matrix DT, which is denoted as DT.

00246_psisdg12260_1226027_page_2_3.jpg

The interest similarities among users can be calculated by the users’ topic distributions through Pearson correlation coefficient, and are described by equation (2).

00246_psisdg12260_1226027_page_3_1.jpg

where DTij stands for the probability of the word, which is assigned to the topic j in microblog posts of user i. Sim < i, k > denotes the interest similarities among users. T represents the set of all topics in users’ original microblog posts. 00246_psisdg12260_1226027_page_3_2.jpgdenotes the mean interest degree of user i on these topics.

3.2

Information dissemination ability

In this part, we took the microblog quality coefficient and the assimilation effect coefficient as the evaluation index of the information dissemination ability.

3.2.1

Microblog quality coefficient.

When a microblog post is forwarded or commented more frequently, it represents that the user who published it has higher influence. Users tend to share their own thoughts and participate in the discussion on the microblog posts they are interested in. The behaviors of forwarding or commenting embody the interests of the user are often closely related to the original microblog contents. Therefore, we can integrate these microblog posts and mine the latent topics to find the relevance to the original microblog contents. When the degree of relevance is higher, which implies that other users participated in the interaction with the original content more effectively, and the user’s microblog contents are of higher quality.

Therefore, we integrated all original microblog posts by the same user into a document Mp, and integrated these thoughts related to the original contents into a document Mc. The topic probability distributions of the two documents θp (k) and θc (k) are generated through LDA model. KL distance8 is usually used to measure the difference between the two probability distributions, which can be given by equation (3).

00246_psisdg12260_1226027_page_3_3.jpg

The JS divergence9 is a smoother and symmetrical probability distribution measurement based on the KL divergence. Inspired by the JS divergence, we transform the KL divergence equation and use equation (4) to symmetrize the KL divergence.

00246_psisdg12260_1226027_page_3_4.jpg

When DKL (θc, θp) gets greater value, which indicates that the similarity between Mp and Mc is smaller. There may exist some worthless communications or invalid forwarding in the user’s microblog posts. Consequently, the quality coefficient should be appropriately reduced. So the effective factor δ is defined as:

00246_psisdg12260_1226027_page_3_5.jpg

The microblog quality coefficient is expressed as follows:

00246_psisdg12260_1226027_page_3_6.jpg

where Qk represents quality coefficient of user k. Nk stands for the total number of microblog posts published by user k. Rk, Ck denotes the number of microblog posts has been forwarded and commented of user k respectively. δ is the introduced effective coefficient.

3.2.2

Assimilation effect coefficient.

Users can browse the contents of bloggers’ microblog posts and they tend to forward or comment a microblog post they are interested in. Their interests and hobbies will also be imperceptibly affected during the process. The greater proportion of forwarding and commenting a user has, the user’ s interests are more easily assimilated by others10.

The user assimilation effect coefficient is shown in equation (7).

00246_psisdg12260_1226027_page_4_1.jpg

where rk, ck, and nk indicate the number of forwarding, commenting and all microblog posts for user k, respectively. Sk stands for the assimilation effect coefficient of user k.

3.2.3

Information dissemination ability.

The spread of information mainly through the forwarding and commenting among users. The larger the microblog quality coefficient and assimilation effect coefficient of users, the stronger the ability of users to control information transmission, which is conducive to information transmission in the microblog social network. Therefore, we defined the information dissemination ability of user k as Wk. The expression is shown as below:

00246_psisdg12260_1226027_page_4_2.jpg

3.3

MUI-ISIDA algorithm model

The basic idea of the MUI-ISIDA algorithm is to take the user’s information dissemination ability as user’s own weight. And according to the similarities among users, reasonably distribute the contribution of followers to the influence of bloggers. The more similar the users’ interest theme, the greater the mutual influence among users. Therefore, the MUI-ISIDA value is computed as:

00246_psisdg12260_1226027_page_4_3.jpg

where f(i) is the set of followers of user i. φ(j, i) is the rate of MUI-ISIDA value that user j contributes to the user i, and the value is determined by the sum of the similarities among user j and the following users. The value is expressed as:

00246_psisdg12260_1226027_page_4_4.jpg

where A(j) is the set of users followed by user j, k represents the k-th user in the set, and N denotes the total number of users followed by the same user. We defined the initial value of MUI-ISIDA as one and the MUI-ISIDA value could converge to a certain constant through repeated iterations. Consequently, the MUI-ISIDA value of all users can be obtained.

4.

EXPERIMENT AND ANALYSIS

4.1

Data set

Sina microblog is a social network platform similar to Twitter. Until October 2020, monthly active users have reached 523 million. Therefore, we selected Sina microblog as the data source. We had obtained a batch of relevant users data. The total number of friends/followers records reaches 83,000. The dataset information is shown in Table 1.

Table 1.

Data set information table.

StatisticsData value
User counts12746
Time2015.8
Original microblog counts441687
Forwarded microblog counts66289
Comment counts55784
Per capita microblog counts39.85

4.2

Experimental results and analysis

To evaluate the validity of the ranking results, we listed the top-10 users of PageRank and MUI-ISIDA. The results are demonstrated in Tables 2 and 3.

Table 2.

Top-10 users of PageRank.

UsernameFollowers countsInteraction countsMicroblog countsMicroblog qualityPR value
Prometheus80467351.914.1309
Liu HuiZ6251662.673.4485
Zhang RuiJ45336103.63.1501
Zhang W45514280.53.1263
Ju Geg8815270.193.0019
He BaoH59620250.82.9966
Yang B60210250.42.8629
EllieAga928723.52.8158
Xing XueP39017310.552.7824
Shang LeiM48895511.862.7631

Table 3.

Top-10 Users of MUI-ISIDA.

UsernameFollowers countsInteraction countsMicroblog countsMicroblog qualityMUI-ISIDA value
Shang LM48895511.866.1642
Zhang PW267144197.585.3367
Lun Zi26662421.484.6187
Prometheus80467351.913.7657
Wu YiT45248202.43.5735
Li Yang49842281.53.3868
Chen XiaoH22679243.293.3493
Fang F29849172.883.0384
Zhao XueM61737281.322.9471
Zeng Gy242128681.882.8498

4.2.1

Top-10 users of the two methods.

As is shown in Tables 2 and 3, the interaction counts include the total number of the user’s microblog posts have been forwarded and commented. The quality of microblog is defined as the ratio of the user’s interaction counts to the number of microblog posts. It can be seen in Table 2 that some top users of PageRank have high follower counts, which demonstrates that the number of followers is directly related to user influence ranking. In the results of MUI-ISIDA, the top users seem relatively reasonable. More followers do not connote higher influence. Take ‘Zhang PW’ and ‘Zeng Gy’ as examples, although their followers counts are small, but other users interact with them frequently, so they have higher influences. That’s to say, the interaction counts are directly related to user influence in MUI-ISIDA.

4.2.2

The accuracy of user influence ranking.

In this part, we compared our algorithm with PageRank and the MR-UIRank11, and analysed the experimental results. If a user’s microblog posts are forwarded or commented more frequently, it means that he has a higher influence. Therefore, the number of interactions can reflect the user’s influence. In addition, users tend to browse high-quality microblog contents and interact with the information they are interested in. The higher quality of a user’s microblog posts, the more frequent interaction will be attracted. So the quality of microblog posts surely reflect the user influence.

In Figures 2 and 3, the abscissa denotes the top k users of the three methods, the ordinate represents the hit rates of the corresponding results in microblog interaction rankings and microblog quality rankings. It can be seen from figures 2 and 3 that when the number of top k is 30, the hit rates of the MUI-ISIDA both increased by 23.3% compared with PageRank. The hit rates increased by 6.7% and 3.4% when compared with MR-UIRank respectively, good experimental results have been achieved. This is because the MUI-ISIDA algorithm considers multi-dimensional factors, and reasonably integrates into PageRank. In this paper, the hit rate is used as an indicator of the accuracy of judgment. The higher hit rate, the higher accuracy is. Experimental results show that the accuracy of MUI-ISIDA algorithm in influence calculation is better than PageRank and MR-UIRank.

Figure 2.

Microblog quality ranking.

00246_psisdg12260_1226027_page_6_1.jpg

Figure 3.

Interaction ranking.

00246_psisdg12260_1226027_page_6_2.jpg

5.

CONCLUSIONS

We proposed a new algorithm to calculate users’ influences, which fully integrated the interest theme similarity and information dissemination ability. The proposed algorithm improves the effectiveness and objectivity of user influence calculation to a certain extent. On the one hand, we analyzed users’ original microblog posts to capture their interests, so as to distribute the influence reasonably. On the other hand, we quantified users’ behaviors and verified the effectiveness of them. Finally, we tested our algorithm on microblog datasets. According to the experimental results, our proposed algorithm has achieved a higher accuracy than other state-of-the-art algorithms.

ACKNOWLEDGEMENT

The work was supported by grants from the Nature Science Foundation of Anhui Province in China, No.2008085MF193 and No.1908085MF194, the University Synergy Innovation Program of Anhui Province, No. GXXT-2019-008, the Outstanding Young Talents Program of Anhui Province, Grant Number gxyqZD2018060, Provincial quality project of Anhui Province Education Department, No.2019jyxm0285. And we thank all the anonymous reviewers for their hard work and valuable comments.

REFERENCES

[1] 

Liu, Q., Xiang, B., Yuan, N. J., Chen, E. H., Xiong, H., Zheng, Y. and Yang, Y., ACM Transactions on Knowledge Discovery from Data, 11 ((3)), 1 –30 (2017). Google Scholar

[2] 

Wen, Z., Kveton, B., Valko, M. and Vaswani, S., Advances in Neural Information Processing Systems, 3022 –3032 (2017). Google Scholar

[3] 

Page, L., Brin, S., Motwani, R. and Winograd, T., Stanford Digital Libraries Working Paper, 9 ((1)), 1 –14 (1998). Google Scholar

[4] 

Weng, J., Lim, E. P., Jiang, J. and He, Q., Proc. of the Third International Conf. on Web Search and Web Data Mining, New York, USA, (2010). Google Scholar

[5] 

Zhao, J., Gui, X. and Tian, F., IEEE Access, 5 3008 –3015 (2017). https://doi.org/10.1109/ACCESS.2017.2672680 Google Scholar

[6] 

Liu, B. Y., Wang, C. R., Wang, C., Wang, J. W. and Huang, M., Journal of Software, 28 ((2)), 246 –261 (2017). Google Scholar

[7] 

Blei, D. M., Ng, A. Y. and Jordan, M. I., Journal of Machine Learning Research, 3 993 –1022 (2012). Google Scholar

[8] 

Cheng, S. and Wang, W., Information, 11 (1), 4 (2020). https://doi.org/10.3390/info11010004 Google Scholar

[9] 

Nguyen, H. V. and Vreeken, J., in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 173 –189 (2015). Google Scholar

[10] 

Huang, M., Zhang, B., Zou, G., Gu, C. and Zhu, Z., in Int. Conf. on Big Data Intelligence & Computing & Cyber Science & Technology Cong, 124 –131 (2016). Google Scholar

[11] 

Sun, H. and Zuo, T., Journal of Chinese Computer Systems, 39 ((1)), 42 –47 (2018). Google Scholar
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shulin Cheng, Ziming Wang, Meng Qian, Shan Yang, and Xin Zheng "MUI-ISIDA: an improved calculation method for user influence in social networks", Proc. SPIE 12260, International Conference on Computer Application and Information Security (ICCAIS 2021), 1226027 (24 May 2022); https://doi.org/10.1117/12.2637483
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Social networks

Diffusion tensor imaging

Internet

Mining

Analytical research

Lithium

Statistical analysis

Back to Top