The Pygmalion effect

In short, the Pygmalion effect states that once an expectation is set, even if it isn’t accurate, we tend to act in ways that are consistent with that expectation. Surprisingly often, the result is that the expectation, as if by magic, comes true.

Think how our beliefs affect our perceptions, decisions and hence our actions. Beyond that, your beliefs are so powerful that they can literally affect your biochemistry.

Yes, your beliefs can affect the state of your energy levels and your physical wellbeing.

Have you heard of the placebo effect?

Psychology of the markets

It’s been discovered, by repeated observations, that changes in mass psychology and therefore the markets are actually patterned.

These patterns repeat themselves, and that’s what makes markets predictable, though only in a probabilistic sense. Once you know what part of the pattern the market is in, you can make a probabilistic forecast as to where the market will go next.

This is the basis of technical analysis.

The Pomodoro Technique

The Pomodoro Technique is a way to get the most out of time management.

Pomodoro = Italian for tomato; the name originally comes from a tomato-shaped timer …

Francesco Cirillo created the Pomodoro Technique in 1992. It is now practiced by professional teams and individuals around the world. I implemented it today, and I must say it’s pretty effective and a very good tool for procrastinators.

What you need to get started:

A timer, a plain sheet of paper, and something to write with.

Getting started:

1) Make your list of tasks on the blank paper using your writing tool (duh :P )

2) Prioritize your tasks (balance by importance and shortest task)

3) Beside every task, write down how many Pomodoro sessions (25 mins each) the task requires to be completed.

4) Set the Pomodoro (the timer) for 25 mins.

5) Work like a dog till the Pomodoro buzzes.

6) Take a 5 min break.

7) Take a 30 – 45 min break after 4 Pomodoro sessions.

8) You can comfortably work for 2 complete Pomodoro cycles (7 Pomodoro breaks).

If followed religiously, you can expect a sure exponential rise in your productivity.
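
For anyone who would rather see the routine as a schedule, here is a minimal Python sketch of the steps above. The function name pomodoro_schedule and the constants are my own, and the long break is fixed at 30 minutes (the low end of the 30 – 45 range), so treat it as an illustration rather than an official Pomodoro tool.

```python
from typing import List, Tuple

WORK_MIN = 25           # one Pomodoro session
SHORT_BREAK_MIN = 5     # break after an ordinary session
LONG_BREAK_MIN = 30     # assumption: low end of the 30 to 45 minute range
SESSIONS_PER_CYCLE = 4  # a long break follows every 4th session


def pomodoro_schedule(sessions: int) -> List[Tuple[str, int]]:
    """Return (activity, minutes) steps for the given number of sessions."""
    steps: List[Tuple[str, int]] = []
    for n in range(1, sessions + 1):
        steps.append(("work", WORK_MIN))
        if n == sessions:
            break  # nothing is scheduled after the final session
        if n % SESSIONS_PER_CYCLE == 0:
            steps.append(("long break", LONG_BREAK_MIN))
        else:
            steps.append(("short break", SHORT_BREAK_MIN))
    return steps


schedule = pomodoro_schedule(8)  # two complete cycles
break_count = sum(1 for activity, _ in schedule if activity != "work")
total_minutes = sum(minutes for _, minutes in schedule)

for activity, minutes in schedule:
    print(f"{activity}: {minutes} min")
```

With 8 sessions the sketch schedules 7 breaks in between, which matches the two complete cycles mentioned in the list.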

Google for more info on this technique

Fear is Good!

“The point is, ladies and gentleman, that fear, for lack of a better word, is good. Fear is right, fear works. Fear clarifies, cuts through, and captures the essence of the evolutionary spirit. Fear, in all of its forms; fear for loss of life, of money, of love, knowledge has marked the difference between those that take informed risks and make money and those that remain petrified by fear in a perpetual state of inaction.”

Politicians, the chief peddlers of fear, recognised this centuries ago!

From the Market Oracle

Web personalization part 6 – Future

Some Discussions

Web personalization will enable the delivery of services and advertisements based on user interest, thereby improving the quality of user interaction and leading to higher customer loyalty. It is important to identify user groups and focus the site towards them [2]. Not all users are equal; some are more profitable to the organization than others. Thus, it is necessary to make sure that the site caters to the wanted users and that the wanted users come to the site.

End users operate in a mode of not knowing what they want. End users operate in a mode of, “Give me what I say I want, then I can tell you what I really want.” They also operate in a mode of discovery. They say, “Ah – now that I see what the possibilities are, I can tell you what I really want. However, until I can see what the possibilities are, I really can’t tell you what I want.” [11] Information on how customers are using a web site is critical for marketers of e-commerce businesses. Four distinct steps in the customer relationship life cycle can be supported by knowledge discovery techniques: customer attraction, customer retention, cross sales and customer departure [4].

This information comes primarily from the usage data combined with the structure data, processed by the different mining algorithms. Using classification, clustering and association rules, the navigational behavior of the customer can be analyzed from different perspectives.

We collect all the clickstream data, and the objective is to analyze it from a higher plane of reference: to really be able to infer the intent of site visitors. It’s not only about generating statistics and rules about web site users, but about understanding the psychology behind the numbers and rules and combining it with the company’s objectives. If the goal of a site is not defined clearly, it will not be measurable, and it will therefore be difficult to generate useful Key Performance Indicators for the website. John Quarto-vonTivadar, chief scientist at Future Now and the inventor of “Persuasion Architecture”, says: “True or False? If you had 100% metaphysical certitude analytics coverage and could know anything you wanted to know, would some companies still be unable to increase their conversion rate? I depressingly suspect the answer is True…”

Avinash Kaushik, a web analyst and the author of “Web analytics: An hour a day”, says that the single greatest root cause of failure with web analytics is the inability to understand what the site is trying to do, and hence to define goals. To reinforce this point he has developed the Trinity concept:

“The goal of the Trinity mindset is to power the generation of actionable insights. Its goal is not to do reporting. Its goal is not to figure out how to spam decision makers with data. Actionable Insights & Metrics are the uber-goal simply because they drive strategic differentiation and a sustainable competitive advantage.”

Cons of web mining: some ethical issues [12]:

Web mining technology itself doesn’t create issues, but when it is used on data of a personal nature it can cause concerns. The most criticized ethical issue involving web mining is the invasion of privacy. The growing trend of selling personal data as a commodity encourages website owners to trade personal data obtained from their site, which is a cause of concern to users.

[10] http://www.webstatsgold.com/webanalytics.htm

[11] http://www.informationmanagement.com/issues/20041101/1012403-1.html

[12] http://en.wikipedia.org/wiki/Web_usage_mining

[13] Web analytics: An hour a day – Avinash Kaushik

[14] http://www.kaushik.net/avinash/2009/05/webmetrics-analytics-questions-facebook-edition.html

Web personalization part 5 – Web Analytic Metrics

Here are a few common web analytic metrics generated from clickstream data. Keeping in mind the goal of web mining and following the metrics listed below will help us better understand their use. Broadly all web mining goals can be classified as: 1] Increasing revenue, 2] reducing costs and 3] improving customer loyalty.

Conversion Rate = Transactions / Visits. This number answers the question: what percentage of visits ended with a customer buying a product, subscribing to a service, or filling out a form? [9]

Page Depth = Total Pages Viewed / Visits. Page depth is a key metric for determining site stickiness. Stickiness determines how engaging a website is to a visitor. Typically, there is a direct correlation between page depth and conversion rates. The more pages a visitor sees, the better the chance the visitor converts. [9]

Bounce Rate: A “bounce” occurs when a person leaves a website immediately without having viewed any page but the entry page. The number of bounces is compared to those who visit more than one page to give a ‘Bounce Rate’. There are two main problems that lead to a high bounce rate: attracting the wrong kind of traffic and not giving the visitor what they were looking for. A high bounce rate is also a good indicator for detecting click fraud. [10]

Drop-out Rate: for a given process (say, a purchasing process or a registration process), the percentage of people who fail to get through that process successfully.

Path Analysis – shows how visitors flow through the site. It gives the sequence of hyperlinks that one or more website visitors follow on a given site. The ideal path through the site should go from the homepage to the products page to the orders page, and finally to the checkout page. Deviations might include paths to tutorials, articles and other information pages. The marketer can select a particular visitor or drop-out and then drill down to the detail page to reveal every page visited and the path taken, as well as the amount of time spent viewing each page.

Page View Duration - Average amount of time that visitors spend on each page of the site. As with Session Duration, this metric is complicated by the fact that analytics programs cannot measure the length of the final page view.

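Cost per Acquisition = Cost / Visits. How much does it cost to bring a visitor to the website? (Here each visit counts as an acquisition; when an acquisition means a converted customer, divide by conversions instead.) CPA can help identify efficient and inefficient traffic sources/mediums. [9]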

ROI = (Revenue – Cost) / Cost. ROI is the Holy Grail of Key Performance Indicators for any campaign that has an associated cost.
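
To make the formulas above concrete, here is a small Python sketch that computes them from raw counts. The function names and the campaign figures are made up for illustration, and the bounce rate divides bounces by all visits, which is a common convention but an assumption on my part.

```python
def conversion_rate(transactions: int, visits: int) -> float:
    """Conversion Rate = Transactions / Visits."""
    return transactions / visits


def page_depth(pages_viewed: int, visits: int) -> float:
    """Page Depth = Total Pages Viewed / Visits."""
    return pages_viewed / visits


def bounce_rate(bounces: int, visits: int) -> float:
    """Share of visits that saw only the entry page (assumed convention)."""
    return bounces / visits


def cost_per_acquisition(cost: float, visits: int) -> float:
    """Cost per Acquisition = Cost / Visits, as defined in the text."""
    return cost / visits


def roi(revenue: float, cost: float) -> float:
    """ROI = (Revenue - Cost) / Cost."""
    return (revenue - cost) / cost


# Hypothetical campaign figures, for illustration only.
visits, transactions, pages_viewed, bounces = 2000, 50, 7000, 800
cost, revenue = 500.0, 1250.0

report = {
    "conversion_rate": conversion_rate(transactions, visits),
    "page_depth": page_depth(pages_viewed, visits),
    "bounce_rate": bounce_rate(bounces, visits),
    "cost_per_acquisition": cost_per_acquisition(cost, visits),
    "roi": roi(revenue, cost),
}

for name, value in report.items():
    print(f"{name}: {value:.3f}")
```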

Other information that can be read from clickstream data includes the IP address of the user, the page (URL) the user requested, and the timestamp (down to seconds) of the event. From these 3 facts, we can derive a great deal of extra information. First, we can identify individual users through the IP address, which IP geolocation can then map to the user’s country and city of origin. We can also identify user sessions, including start page, end page and all pages visited between those two.

Finally, we can also group users by time between clicks and the time of day/week users are on the site. This is useful to monitor traffic from specific geographies and to detect click fraud.
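
As a rough sketch of how sessions might be derived from those three facts, the Python below groups hits by IP address and starts a new session whenever the gap between two clicks exceeds a timeout. The 30-minute timeout, the sessionize name and the sample hits are my own assumptions, not something prescribed by the sources above.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Hit = Tuple[str, str, datetime]          # (ip, url, timestamp)
Session = List[Tuple[str, datetime]]     # ordered (url, timestamp) pairs
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cut-off


def sessionize(hits: List[Hit]) -> Dict[str, List[Session]]:
    """Group hits per IP, splitting into sessions on long gaps."""
    sessions: Dict[str, List[Session]] = defaultdict(list)
    for ip, url, ts in sorted(hits, key=lambda h: (h[0], h[2])):
        user_sessions = sessions[ip]
        last_ts = user_sessions[-1][-1][1] if user_sessions else None
        if last_ts is not None and ts - last_ts <= SESSION_TIMEOUT:
            user_sessions[-1].append((url, ts))  # same session continues
        else:
            user_sessions.append([(url, ts)])    # a new session starts
    return dict(sessions)


# Hypothetical log entries using documentation-only IP addresses.
hits = [
    ("192.0.2.1", "/home", datetime(2009, 5, 1, 9, 0, 0)),
    ("192.0.2.1", "/products", datetime(2009, 5, 1, 9, 2, 10)),
    ("192.0.2.2", "/home", datetime(2009, 5, 1, 9, 5, 0)),
    ("192.0.2.1", "/home", datetime(2009, 5, 1, 11, 0, 0)),
]

result = sessionize(hits)
for ip, user_sessions in result.items():
    for session in user_sessions:
        start_page, end_page = session[0][0], session[-1][0]
        print(ip, "start:", start_page, "end:", end_page, "pages:", len(session))
```

Keep in mind that one IP address can hide several people (or one person can show up under several), so real tools usually combine this with cookies or logins.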

These metrics are processed under different algorithms to infer the different types of information.

Some examples include:

Association rules are employed to discover associated events, products and pages.

Clustering is used to discover visitor groups with common properties, interests and common behavior.

Classification is used to characterize visitors with respect to a set of predefined classes. It is also used to detect card fraud.

It might be appropriate to include a reference to search engine optimization (SEO) at this stage. Search engine optimization is the process of changing various elements of a web site to optimize targeted traffic through search engines. This is done by using the web analytic metrics to analyze the site. The goal of this analysis is to understand the market, the target audience, and the result that the site owner wants to achieve. Some SEO metrics used are: number of links pointing to the site, keyword density, keyword proximity, number of searches per keyword, internal site links, URL normalization and page design. The analysis phase is followed by the implementation of these changes. The results are tracked and monitored to achieve better search engine ranking and more targeted traffic.

Search engine optimization primarily uses web content mining, and web structure mining.

[8] http://www.bipminstitute.com/datamining/propensity-cluster-associationpatterns.php

[9] http://www.weblinc.com/Our_Services/Metrics_Analytics/

Web personalization part 4 – Web Mining algorithms

Some Web Mining Algorithms

Web mining uses data mining propensity models to discover a natural inclination or tendency across the variables. This group of techniques primarily includes classification algorithms, association rules, and sequential/temporal pattern analysis algorithms. [8]

a) Classification in databases is the process of separating a data set into components that reflect a consistent pattern of behavior. Once the patterns have been established, they can be used to break data into more understandable subsets and to provide sub-groups of a population for further analysis.

Clustering is used to find similar web-content data and to group users by their behavior and their geographic location. Classification methods include decision-tree methods such as C4.5, statistical methods, neural networks…

b) Association rules are rules which imply certain association relationships among a set of objects in a database. Association rule techniques include the Apriori algorithm, market basket analysis and others. These are used in recommendation engines like the one used by Amazon, which suggests other books bought by users who previously bought this particular book, or books of similar interest to a group of users. Association rules can also suggest similar kinds of sites that other users have visited.
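
To give a feel for what an association rule looks like in practice, here is a minimal Python sketch that counts item pairs in shopping baskets and keeps the rules that clear a support and a confidence threshold. It only looks at pairs, so it is a simplification and not the full Apriori algorithm; the pair_rules name, the thresholds and the baskets are all made up for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import List, Set, Tuple

Rule = Tuple[str, str, float, float]  # (if bought, then suggest, support, confidence)


def pair_rules(baskets: List[Set[str]], min_support: float,
               min_confidence: float) -> List[Rule]:
    """Find 'lhs -> rhs' rules between single items."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules: List[Rule] = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs in basket | lhs in basket)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical purchase histories.
baskets = [
    {"book_a", "book_b"},
    {"book_a", "book_b", "book_c"},
    {"book_a", "book_c"},
    {"book_b"},
]

rules = pair_rules(baskets, min_support=0.5, min_confidence=0.9)
for lhs, rhs, support, confidence in rules:
    print(f"bought {lhs} -> suggest {rhs} "
          f"(support {support:.2f}, confidence {confidence:.2f})")
```

In this toy data everyone who bought book_c also bought book_a, so that is the only rule that survives the thresholds.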

c) Sequential/temporal pattern functions analyze a collection of records over a period of time, for example to identify trends. A sequential pattern function will analyze collections of related records and detect frequently occurring patterns of events over time. This is done using clickstream data from the web logs on the server. The algorithm analyzes users’ navigational behavior through websites, and from page to page within websites. This information can then be used to optimize user navigation and keep users on the site for a longer period.
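
The simplest version of this idea is to count which page-to-page moves show up in many sessions. The Python sketch below does just that; it only looks at directly consecutive pages, so it is far simpler than real sequential pattern miners, and the frequent_transitions name and the sample sessions are my own.

```python
from collections import Counter
from typing import List, Tuple

Transition = Tuple[str, str]  # (from_page, to_page)


def frequent_transitions(sessions: List[List[str]],
                         min_count: int) -> List[Tuple[Transition, int]]:
    """Return page-to-page moves seen in at least min_count sessions."""
    counts: Counter = Counter()
    for pages in sessions:
        # Use a set so a move repeated within one session counts once.
        counts.update(set(zip(pages, pages[1:])))
    frequent = [(move, c) for move, c in counts.items() if c >= min_count]
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(frequent, key=lambda entry: (-entry[1], entry[0]))


# Hypothetical sessions, each an ordered list of visited pages.
sessions = [
    ["/home", "/products", "/cart", "/checkout"],
    ["/home", "/products", "/tutorial"],
    ["/home", "/articles"],
    ["/products", "/cart", "/checkout"],
]

patterns = frequent_transitions(sessions, min_count=2)
for (src, dst), count in patterns:
    print(f"{src} -> {dst}: seen in {count} sessions")
```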

Web personalization – part 3 – correlations

The extraction of correlations between and across different kinds of data

Once the raw data has been processed and converted to information, the next step is pattern discovery. At this stage patterns are discovered and rules are made.

Following the pattern discovery stage is pattern analysis. The ways that are employed in order to analyze the collected data include content-based filtering, collaborative filtering, rule-based filtering, and Web usage mining.

Content-based filtering systems are solely based on individual users’ preferences. The system tracks each user’s behavior and recommends items to them that are similar to items the user liked in the past.

Collaborative filtering systems ask users to reveal their preferences and interests and then return information that is predicted to be of interest to them. This is based on the assumption that users with similar behavior (e.g. users that rate similar objects) have analogous interests. Collaborative filtering approaches suffer from the cold-start problem: they have little to go on for a new user or a new item [5].
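
Here is a minimal Python sketch of the collaborative idea: score items a user has not rated yet by how similar the other raters are to that user. I use cosine similarity over ratings, which is one common choice and an assumption here; the recommend function and the ratings are invented for illustration.

```python
from math import sqrt
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two users' rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0  # no overlap means no evidence of similarity
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings: Ratings, user: str, top_n: int = 3) -> List[str]:
    """Rank unseen items by similarity-weighted ratings of other users."""
    own = ratings[user]
    scores: Dict[str, float] = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        similarity = cosine(own, theirs)
        if similarity <= 0.0:
            continue  # ignore users with nothing in common
        for item, rating in theirs.items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


# Hypothetical ratings on a 1 to 5 scale.
ratings: Ratings = {
    "ann": {"book_a": 5, "book_b": 4},
    "bob": {"book_a": 5, "book_b": 5, "book_c": 4},
    "cat": {"book_d": 5},
    "new_user": {},
}

print(recommend(ratings, "ann"))
print(recommend(ratings, "new_user"))
```

The second call comes back empty because new_user has rated nothing yet, which is the cold-start problem in miniature.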

In rule-based filtering the users are asked to answer a set of questions. These questions are derived from a decision tree, so as the user proceeds to answer them, what he finally receives as a result is tailored to his needs.

The present trend is leading towards the fourth technique, which is web mining. Web Mining applies data mining techniques on the web log data, resulting in a set of useful patterns that indicate users’ navigational behavior.

Web usage mining is the automatic discovery of patterns in clickstreams and associated data collected or generated as a result of user interactions with one or more Web sites [6]. The information gathered through Web mining is evaluated by using traditional data mining parameters.

There are broadly three knowledge discovery domains that pertain to web mining: web content mining, web structure mining, web usage mining.

The technologies normally used in web content mining are NLP (natural language processing) and IR (information retrieval). Content mining is more than just keyword extraction: it uses “wrappers” to map documents to some data model. Content mining is used to mine the content of a document or to improve the search on the content. [7]

Web structure mining is the process of using graph theory to analyze the node and connection structure of a web site. For example, links pointing to a document indicate the popularity of the document, while links coming out of a document indicate the spread of topics covered in the document.
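
As a tiny illustration of that link-counting idea, the Python sketch below treats a site as a list of (source, target) links and counts links in and links out per page. The link_degrees name and the sample links are hypothetical, and real structure mining goes well beyond simple counts.

```python
from collections import Counter
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (source_page, target_page)


def link_degrees(links: List[Link]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (in_degree, out_degree) per page, ignoring duplicate links."""
    unique_links = set(links)
    in_degree = Counter(target for _, target in unique_links)
    out_degree = Counter(source for source, _ in unique_links)
    return dict(in_degree), dict(out_degree)


# Hypothetical site structure; the last link is a deliberate duplicate.
links = [
    ("/home", "/products"),
    ("/home", "/articles"),
    ("/articles", "/products"),
    ("/products", "/checkout"),
    ("/home", "/products"),
]

in_degree, out_degree = link_degrees(links)
most_linked = max(in_degree, key=lambda page: in_degree[page])
print("most linked-to page:", most_linked, in_degree[most_linked])
print("links out of /home:", out_degree["/home"])
```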

Web usage mining applies data mining techniques on access logs to reveal access patterns that can be used to restructure sites into a more efficient grouping, to pinpoint effective advertising locations, and to target specific users for personalized ads. [7]

[5] Web personalization expert with combining collaborative filtering and association rule mining technique – C.-H. Lee, Y.-H. Kim and P.-K. Rhee

[6] Web Data Mining – Bing Liu

[7] http://www.galeas.de/webmining.html