Was a website redesign successful?

A DataCamp challenge, May 2023

A/B testing

The project

You work for an early-stage startup in Germany. Your team has been working on a redesign of the landing page. The team believes a new design will increase the number of people who click through and join your site.

They have been testing the changes for a few weeks. Now they want to measure the impact of the change, and they need you to determine whether the observed increase is statistically significant or could be due to random chance.

The team assembled the following data file:

treatment - “yes” if the user saw the new version of the landing page, “no” otherwise.
new_images - “yes” if the page used a new set of images, “no” otherwise.
converted - 1 if the user joined the site, 0 otherwise.

The control group is those users with “no” in both columns: the old version with the old set of images.

Complete the following tasks:

Analyze the conversion rates for each of the four groups: the new/old design of the landing page and the new/old pictures.
Can the increases observed be explained by randomness?
Which version of the website should they use?

      treatment new_images  converted
0           yes        yes          0
1           yes        yes          0
2           yes        yes          0
3           yes         no          0
4            no        yes          0
...         ...        ...        ...
40479        no         no          0
40480       yes        yes          0
40481       yes        yes          0
40482        no         no          0
40483       yes        yes          0

[40484 rows x 3 columns]

Data validation

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 40484 entries, 0 to 40483
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   treatment   40484 non-null  object
 1   new_images  40484 non-null  object
 2   converted   40484 non-null  int64 
dtypes: int64(1), object(2)
memory usage: 949.0+ KB

Let’s see if the four groups are equally represented in the data.

treatment  new_images
no         no            0.25
           yes           0.25
yes        no            0.25
           yes           0.25
Name: proportion, dtype: float64

Yes: each group accounts for about 25% of the rows. A balanced split like this helps reduce bias when comparing the groups.
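As a minimal sketch of how such a balance check can be done with pandas: the small DataFrame below is made-up toy data standing in for the real file (whose name is not given here), and `group_shares` is a hypothetical variable name.

```python
import pandas as pd

# Toy data (made up for illustration); the real data file has 40,484 rows
df = pd.DataFrame({
    "treatment":  ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
    "new_images": ["yes", "no", "yes", "no", "yes", "no", "no", "yes"],
    "converted":  [0, 1, 0, 0, 1, 0, 0, 1],
})

# Share of rows that fall in each (treatment, new_images) combination
group_shares = (
    df.groupby(["treatment", "new_images"])
      .size()
      .div(len(df))
)
print(group_shares)
```

With the real data, each of the four shares would be expected to come out at roughly 0.25.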

Data analysis

I will calculate the conversion rates for each group.

treatment  new_images  converted
no         no          0.107104
no         yes         0.112538
yes        no          0.120047
yes        yes         0.113724
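Because `converted` is coded as 0/1, the conversion rate of a group is simply the mean of that column within the group. A minimal sketch on made-up toy data (not the real file; `conversion_rates` is a hypothetical name):

```python
import pandas as pd

# Toy data (made up for illustration), using the same column names as the real file
df = pd.DataFrame({
    "treatment":  ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
    "new_images": ["yes", "no", "yes", "no", "yes", "no", "no", "yes"],
    "converted":  [0, 1, 0, 0, 1, 0, 0, 1],
})

# Mean of a 0/1 column = fraction of users who converted in each group
conversion_rates = df.groupby(["treatment", "new_images"])["converted"].mean()
print(conversion_rates)
```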

Let’s visualize these values in a graph.

../_images/5918d25fa8e049f57acd26322714794deb2f1c6dde8f5a99389f3995e55a9e69.png

This is what we can see:

Control (actual): The control group has the lowest conversion rate among the four groups.
New images only: With new images in the old page design, the conversion rate increases, but its confidence interval overlaps with that of the control group, suggesting that the difference may not be significant.
New web only: With the new page design and the old images, the difference from the control group appears clearer.
New images + new web: The confidence interval of the conversion rate clearly overlaps with that of the control group.
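One simple way to approximate such intervals is the normal-approximation (Wald) interval for a proportion. The sketch below is not necessarily how the intervals in the graph were computed. It assumes 10,121 users per group (40,484 rows split evenly across four groups) and uses the conversion rates from the table above; `wald_interval` is a hypothetical helper.

```python
import math

# Assumption: 40,484 rows split evenly across the four groups
n_per_group = 10_121

# Conversion rates taken from the table above
rates = {
    "control": 0.107104,
    "new images only": 0.112538,
    "new web only": 0.120047,
    "new images + new web": 0.113724,
}

def wald_interval(p, n, z=1.96):
    """Approximate 95% interval for a proportion p observed on n users."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

intervals = {name: wald_interval(p, n_per_group) for name, p in rates.items()}
for name, (low, high) in intervals.items():
    print(f"{name:22s} {low:.4f} .. {high:.4f}")
```

Under these assumptions, only the "new web only" interval sits entirely above the control interval, which is consistent with the observations listed above.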

Hypothesis testing

I will calculate the distribution of the mean conversion rate for each group and represent them in a graph.

../_images/67196e6ef03cb2dfbcbcf1fac08b4059723dc555e796a94f90d142ac1d2640ac.png

As with the previous graph, it appears that the 95% confidence interval of the control group does not overlap with that of the “New web only” version of the page, but a hypothesis test is necessary to confirm this.

The null hypothesis I will consider is that all groups share the same true mean conversion rate, so that any observed difference in mean values is purely due to chance.

I will define a function that performs the following tasks:

Takes two measurement arrays and calculates the difference of their mean values.
Concatenates those measurement arrays into a single array to calculate the combined mean value.
Shifts each array’s values so that both arrays have the same combined mean.
Computes 10,000 bootstrap replicates from the shifted arrays, storing the difference of means for each of them.
Counts how many of the 10,000 replicates resulted in a difference equal to or greater than the one we measured. Dividing that count by the number of replicates (10,000) gives us the p-value.
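The steps above can be sketched with NumPy as follows. This is one possible reading of the procedure, not the original code: the function name `bootstrap_pvalue`, its parameters, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_pvalue(control, variant, n_replicates=10_000, seed=0):
    """One-sided bootstrap test for a difference of means (variant - control)."""
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    variant = np.asarray(variant, dtype=float)

    # Step 1: observed difference of means
    observed_diff = variant.mean() - control.mean()

    # Step 2: combined mean of both arrays
    combined_mean = np.concatenate([control, variant]).mean()

    # Step 3: shift both arrays so they share the combined mean (null hypothesis)
    control_shifted = control - control.mean() + combined_mean
    variant_shifted = variant - variant.mean() + combined_mean

    # Step 4: resample each shifted array with replacement, store mean differences
    diffs = np.empty(n_replicates)
    for i in range(n_replicates):
        c = rng.choice(control_shifted, size=control_shifted.size, replace=True)
        v = rng.choice(variant_shifted, size=variant_shifted.size, replace=True)
        diffs[i] = v.mean() - c.mean()

    # Step 5: fraction of replicates at least as large as the observed difference
    p_value = np.mean(diffs >= observed_diff)
    return observed_diff, p_value

# Toy 0/1 arrays (made up): 10% vs 12% conversion on 1,000 users each
toy_control = np.array([1] * 100 + [0] * 900)
toy_variant = np.array([1] * 120 + [0] * 880)
toy_diff, toy_p = bootstrap_pvalue(toy_control, toy_variant)
print(f"observed difference: {toy_diff:.3f}, p-value: {toy_p:.4f}")
```

The test is one-sided because it only counts replicates whose difference is equal to or greater than the observed one, matching the description above.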

0.1059 <- p-value 'New images' only

0.0019 <- p-value 'New webpage' only

0.0692 <- p-value 'New webpage + New images'

The lowest p-value obtained was 0.0019 (0.19%), well below the standard significance level of 5%. It means that, out of 10,000 bootstrap replicates generated under the null hypothesis, only 19 resulted in a difference in mean values equal to or greater than the one we observed. Such a difference would therefore occur very rarely by chance, so we can reject the null hypothesis in this case and conclude that the higher conversion rate is likely due to the new web page design with the old images.

In the other two cases, the p-values of 0.0692 (6.92%) and 0.1059 (10.59%) indicate that differences of that size are more likely to occur by chance. As both are above the 5% significance level, we fail to reject the null hypothesis in these cases.

Bayesian approach

We will evaluate the problem using the Bayesian approach to obtain more information.

../_images/ec34c3723354da69156a8c5388da868f8ee1102b004acd0f6c9e1d08a3ed94b3.png

It is more probable that the “New web only” option is better. Let’s check how sure we are about it.

../_images/c6f79fb300b3a1f02c5b5d19288b5a58c5be631bd298c04ebe7bc4ced1320284.png

The probability that the true conversion-rate increase of the 'New web only' option lies between 0.6 and 2.0 percentage points is 90%.

The probability that the 'New web only' option has the higher conversion rate is 99.9%.
The risk of this being the wrong decision is therefore 0.1%.

Even if the decision turns out to be wrong, we would lose only about 0.10 percentage points of conversion rate.
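Quantities of this kind can be obtained from a Beta-Binomial model. The following is a sketch under stated assumptions rather than the original analysis: it assumes a uniform Beta(1, 1) prior, 10,121 users per group, and conversion counts (1,084 and 1,215) reconstructed from the rates reported earlier. All variable names are hypothetical, and the exact figures depend on the prior and on the random draws.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumptions: even split of 40,484 rows; counts reconstructed from the reported rates
n_control, conv_control = 10_121, 1_084   # ~10.71% conversion
n_variant, conv_variant = 10_121, 1_215   # ~12.00% conversion ('New web only')

# Posterior draws under a uniform Beta(1, 1) prior (an assumption)
draws = 100_000
post_control = rng.beta(1 + conv_control, 1 + n_control - conv_control, size=draws)
post_variant = rng.beta(1 + conv_variant, 1 + n_variant - conv_variant, size=draws)

# Posterior distribution of the difference in conversion rates
diff = post_variant - post_control

# Probability that the variant is truly better than the control
prob_variant_better = np.mean(diff > 0)

# 90% credible interval for the difference
ci_low, ci_high = np.percentile(diff, [5, 95])

# Average difference over the draws where the variant is actually worse
loss_if_wrong = diff[diff < 0].mean()

print(f"P(variant better): {prob_variant_better:.4f}")
print(f"90% credible interval: {ci_low:.4f} .. {ci_high:.4f}")
print(f"average loss if wrong: {loss_if_wrong:.4f}")
```

Here the loss is averaged only over the draws in which the variant is worse, which is one way to read "even if the wrong decision materializes".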

Conclusion

They should use the new version of the landing page with the old set of images: it is the only variant whose increase in conversion rate over the control group is statistically significant.