



Homework questions related to networks, strongly connected components, PageRank equilibrium, and hub-authority algorithm. Students are required to determine the largest strongly connected component of a given graph, identify links to add to increase its size, check if a set of numbers forms an equilibrium set of PageRank values, and compute hub and authority values for a network. The document also discusses strategies for creating pages to achieve high authority scores.
Networks: Fall 2015
Homework 4
David Easley and Jon Kleinberg
Due at 11:15am, October 23, 2015
As noted on the course home page, homework solutions must be submitted by upload to the CMS site, at https://cms.csuglab.cornell.edu/. The file you upload must be in PDF format. It is fine to write the homework in another format such as Word, as long as it’s saved out as PDF. (From Word, for example, you can save files into PDF format.)
The CMS site will stop accepting homework uploads after the posted due date. We cannot accept late homework except for University-approved excuses (which include illness, a family emergency, or travel as part of a University sports team or other University activity).
Reading: The questions below are primarily based on the material in Chapters 13 and 14 of the book.
Figure 1: The network of Web pages for Question (1). (Nodes are labeled 1 through 15.)
(1) Consider the directed graph shown in Figure 1, with nodes representing Web pages and each directed edge representing a link from one Web page to another.
(a) List the nodes in the largest strongly connected component of this graph.
(b) As new links are created or old ones are removed among an existing set of Web pages, such as the one in Figure 1, the set of nodes in the largest strongly connected component can change. Here’s an example of how such a change can occur, through the addition of an edge. Suppose you are allowed to add one link to the graph in Figure 1, going from one node in the figure to another; which link would you add if you wanted to increase the size of the largest strongly connected component by as much as possible? Give an explanation for your answer.
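The effect described in part (b) can be explored with a short Python sketch. Since the edges of Figure 1 are not available in this text, the graph below is a hypothetical 6-node example, not the homework graph, and the function names are illustrative. The sketch finds strongly connected components with Kosaraju's two-pass depth-first search, then tries every possible single new link to see which one enlarges the largest component the most.

```python
from collections import defaultdict

def strongly_connected_components(nodes, edges):
    """Return the SCCs of a directed graph as a list of sets (Kosaraju's algorithm)."""
    forward, backward = defaultdict(list), defaultdict(list)
    for u, v in edges:
        forward[u].append(v)
        backward[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the original graph.
    seen, order = set(), []
    def visit(u):
        seen.add(u)
        for v in forward[u]:
            if v not in seen:
                visit(v)
        order.append(u)
    for u in nodes:
        if u not in seen:
            visit(u)

    # Pass 2: sweep the reversed graph in decreasing completion order;
    # each sweep collects exactly one strongly connected component.
    assigned, components = set(), []
    def collect(u, comp):
        assigned.add(u)
        comp.add(u)
        for v in backward[u]:
            if v not in assigned:
                collect(v, comp)
    for u in reversed(order):
        if u not in assigned:
            comp = set()
            collect(u, comp)
            components.append(comp)
    return components

def largest_scc(nodes, edges):
    return max(strongly_connected_components(nodes, edges), key=len)

# Hypothetical graph (NOT Figure 1): a 3-cycle with a chain hanging off it.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 6)]
before = largest_scc(nodes, edges)

# Try every possible new link; keep the first one that gives the biggest component.
best_edge, best_size = None, len(before)
for u in nodes:
    for v in nodes:
        if u != v and (u, v) not in edges:
            size = len(largest_scc(nodes, edges + [(u, v)]))
            if size > best_size:
                best_edge, best_size = (u, v), size

print(sorted(before), best_edge, best_size)
```

In this hypothetical graph, a link from the end of the chain back into the cycle pulls every node on the chain into the component, because each of them can then reach, and be reached from, the cycle. The recursive search is adequate for small graphs; a very large graph would need an iterative version.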
Figure 2: The network of Web pages for Question (2). (Node labels visible: A, B, C, D, E, G. Values visible: 1/4, 1/8; the others are truncated.)
(2) Let’s consider the limiting PageRank values that result from the Basic PageRank Update Rule (i.e., the version where we don’t introduce a scaling factor s). In Chapter 14, these limiting values are described as “exhibiting the following kind of equilibrium: if we take the limiting PageRank values and apply one step of the Basic PageRank Update Rule, then the values at every node remain the same. In other words, the limiting PageRank values regenerate themselves exactly when they are updated.” This description gives a way to check whether an assignment of numbers to a set of Web pages forms an equilibrium set of PageRank values: the numbers should add up to 1, and they should remain unchanged when we apply the Basic PageRank Update Rule. (See for example Figure 14.7 in Chapter 14 of the book.) Try this on the network of Web pages shown in Figure 2. In particular, say whether the indicated set of numbers forms an equilibrium set of PageRank values under the Basic PageRank Update Rule. Also, provide an explanation for your answer: specify either why they form an equilibrium, or how they fail to form an equilibrium.
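The two-part check described above (the values sum to 1, and one update step leaves them unchanged) can be sketched in Python. The network and values below are hypothetical, not those of Figure 2, and the function names are illustrative. The sketch assumes every page has at least one out-link, and it uses exact fractions so that the equality test is not affected by rounding.

```python
from fractions import Fraction

def basic_pagerank_step(out_links, rank):
    """One step of the Basic PageRank Update Rule: each page splits its current
    value equally over its out-links, and each page's new value is the sum of
    the shares it receives. Assumes every page has at least one out-link."""
    new_rank = {page: Fraction(0) for page in out_links}
    for page, targets in out_links.items():
        share = rank[page] / len(targets)
        for target in targets:
            new_rank[target] += share
    return new_rank

def is_equilibrium(out_links, rank):
    """True if the values sum to 1 and regenerate themselves after one step."""
    return sum(rank.values()) == 1 and basic_pagerank_step(out_links, rank) == rank

# Hypothetical 3-page network (NOT Figure 2): A -> B, A -> C, B -> C, C -> A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

candidate = {"A": Fraction(2, 5), "B": Fraction(1, 5), "C": Fraction(2, 5)}
uniform = {page: Fraction(1, 3) for page in links}

print(is_equilibrium(links, candidate))     # these values regenerate themselves
print(is_equilibrium(links, uniform))       # these sum to 1 but change under the update
print(basic_pagerank_step(links, uniform))  # shows which pages gain or lose value
```

When a set of values fails the check, comparing the output of `basic_pagerank_step` with the input shows exactly which nodes change, which is the kind of explanation the question asks for.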
divide each hub score by the sum of all hub scores. (We will call the scores obtained after this dividing-down step the normalized scores.) Show the values both before and after this final normalization step. It’s fine to write the normalized scores as fractions rather than decimals.
Figure 4: The network of Web pages for Question (4). (Nodes are labeled A through F.)
(b) Now we come to the issue of creating pages so as to achieve large authority scores, given an existing hyperlink structure. In particular, suppose you wanted to create a new Web page X, and add it to the network in Figure 4, so that it could achieve a (normalized) authority score that is as large as possible. One thing you might try is to create a second page Y as well, so that Y links to X and thus confers authority on it. In doing this, it’s natural to wonder whether it helps or hurts X’s authority to have Y link to other nodes as well. Specifically, suppose you add X and Y to the network in Figure 4. In order to add X and Y to this network, one needs to specify what links they will have. Here are two options; in the first option, Y links only to X, while in the second option, Y links to other strong authorities in addition to X.
For each of these two options, we’d like to know how X fares in terms of its authority score. So, for each option, show the normalized authority values that each of A, B, and X get when you run the 2-step hub-authority computation on the resulting network (as in part (a)). (That is, you should perform the normalization step where you divide each authority value down by the total.)
For which of Options 1 or 2 does page X get a higher authority score (taking normalization into account)? Give a brief explanation in which you provide some intuition for why this option gives X a higher score.
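The comparison asked for in part (b) can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the base network (two hypothetical hub pages H1 and H2 linking to one page P) is not Figure 4, and the two link choices for Y are stand-ins for the two options. The sketch starts all scores at 1, applies the Authority Update Rule and then the Hub Update Rule in each of k rounds, and normalizes only at the end.

```python
from fractions import Fraction

def hub_authority(links, k):
    """Run k rounds of the hub-authority computation on a dict of out-links.
    Returns the unnormalized (hub, authority) score dictionaries."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1 for p in pages}
    auth = {p: 1 for p in pages}
    for _ in range(k):
        # Authority Update Rule: sum the hub scores of all pages pointing to p.
        auth = {p: 0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub Update Rule: sum the authority scores of all pages p points to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
    return hub, auth

def normalize(scores):
    """Divide each score by the total, keeping exact fractions."""
    total = sum(scores.values())
    return {p: Fraction(v, total) for p, v in scores.items()}

# Hypothetical base network (NOT Figure 4): two hubs that both link to page P.
base = {"H1": ["P"], "H2": ["P"]}
only_x = {**base, "Y": ["X"]}            # Y links only to X
x_and_more = {**base, "Y": ["X", "P"]}   # Y also links to an existing authority

norm_1 = normalize(hub_authority(only_x, 2)[1])
norm_2 = normalize(hub_authority(x_and_more, 2)[1])
print(norm_1["X"], norm_2["X"])
```

In this hypothetical network, X ends up with a larger normalized authority score when Y also links to P: the extra link raises Y's hub score after the first round, and Y passes that larger hub score on to X in the second round. Whether the same holds for Figure 4 has to be worked out on that network.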
(c) Suppose instead of creating two pages, you create three pages X, Y, and Z, and again try to strategically create links out of them so that X gets ranked as well as possible. Describe a strategy for adding three nodes X, Y, and Z to the network in Figure 4, with choices of links out of each, so that when you run the 2-step hub-authority computation (as in parts (a) and (b)), and then rank all pages by their authority score, node X shows up in second place. (You can have the links from X, Y, and Z point to any nodes you want, including others among X, Y, and Z, and/or the existing nodes in Figure 4.) Show the hub and authority scores in the new network you create, to demonstrate that you’ve succeeded in getting node X into second place. (Note that there’s no way to do this so that X shows up in first place, so second place is the best one can hope for using only three nodes X, Y, and Z.)