David Corne ( home)
Regarding the Web Intelligence and OAB exams
This is the first run of the
Web Intelligence and OAB modules with my involvement (in half of the WI module
and 1/3 of the OAB module). Since you have no resource of past papers containing
questions about my material (and since everyone keeps asking me) this page
contains some helpful (but not too helpful) information.
In each case, the exam has 4 questions, and you have to answer
any three of them. On the WI paper, since I did half of the lectures, you can
reasonably expect that there will often be TWO questions from me. On the OAB
paper, you can reasonably expect that ONE of the questions will be on my
material.
General notes about examinable material:
Everything in the following list is EXAMINABLE:
- my powerpoint slides ("DWC weeks" 1, 2 and 3
for both WI and OAB, weeks 4 and 5 for WI only)
- the SWANSON paper (WI and OAB)
- the SINKA paper (WI and OAB)
- the ADDITIONAL NOTES that accompanied assignment 3 (WI and
OAB)
- the HITS paper (but only the parts of it indicated in the
lecture slides) (WI only)
This means that my questions will assume that you have read
and basically understand all of these things. HOWEVER, the understanding I
expect concerning the papers is not complete and detailed. I just
assume and expect a general and high-level
understanding of these, and a grasp of the main points. For example, example
part questions concerning the papers might be:
- Regarding the Sinka and Corne paper, what were the
general findings concerning the use of stemming? [5 marks] In
answer to this, no need to remember the datasets involved, details of
results etc ...; this asks about one of the general conclusions of the
paper. Basically, stemming is often used in practice, but it does not always
help -- it seems that stemming can sometimes be unhelpful when your system
is trying to make distinctions between very similar types of documents.
- Regarding the Swanson et al paper, describe in broad
terms the roles of Procedure I and Procedure II in the Arrowsmith system. [8
marks] Note that I would not ask for fine detail, such as system diagrams
and pseudocode (UNLESS SUCH DETAIL WAS INCLUDED IN SLIDES); the question is
asking for yur understanding of what basically happens in these procedures.
All other papers are not examinable, they are just recommended
reading. As for these papers (the google paper, etc...), if you use knowledge
gained from them appropriately within an answer to an exam question, that will
probably impress me -- however this is the same for any appropriate general
knowledge and understanding that you exhibit, which was not given in the
examinable material.
Example part-questions:
- Draw the complete graph on 5 nodes; what is its average
node degree, and what is it's diameter? [5 marks]
- Briefly explain why the study of complex networks is
important. [5 marks]
- [I might provide an example graph with about 10 nodes, and
then ask]: What is the degree distribution of this graph? What is the
diameter of this graph? Etc ...
- Give the equation for the cluster coefficient of a network,
and calculate it for the network below [here I might give a simple example
network, like the ones in the slides] - [7 marks]
- Write clear pseudocode for a procedure that can find
the shortest path (i.e the shortest number of hops) between two nodes in a
graph. [6 marks]
- In connection with a graph's degree distribution, write
down the simple equation for the power law, and explain what can be inferred
about the graph if you know the exponent of the power law distribution. [10
marks]
- Explain what `transitive closure' is with regard to
articles on the web, and briefly discuss three challenges that must be solved
before we can make progress towards achieving it. [15 marks]
(WI only, of course):
- Write down and explain the equation for working out the
PageRank of a web page. [7 marks]
- State and briefly explain/discuss three important design
goals for a search engine. [8 marks]
- Describe the role of the Barrels in google's design, and
explain why there are two types of barrels. [9 marks]
- Write pseudocode for the HITs algorithm [10 marks]
- Explain the assumptions that underpin Axelrod's model for
Cultural Dissemination, and write pseudocode for the simple model itself. [10
marks]
`Pseudocode' just means any clear description of an algorithm
or method. What I have in the examinable material (e.g. for HITS, for Google
query evaluation, for graph search, and so on ...) is pseudocode.
------------------------------------------------------
This page last updated: just now.