If you're a UX researcher working with open-ended surveys, interviews, or usability session notes, you probably know the challenge: qualitative data is rich but messy. Traditional coding is time-consuming, sentiment tools feel shallow, and it's easy to miss the deeper patterns hiding in user feedback. These days, we're seeing new ways to scale thematic analysis without losing nuance. These aren't just tweaks to old methods; they offer genuinely better ways to understand what users are saying and feeling.

Emotion-based sentiment analysis moves past generic "positive" or "negative" tags. It surfaces real emotional signals (like frustration, confusion, delight, or relief) that help explain user behaviors such as feature abandonment or repeated errors.

Theme co-occurrence heatmaps go beyond listing top issues and show how problems cluster together, helping you trace root causes and map out entire UX pain chains.

Topic modeling, especially using LDA, automatically identifies recurring themes without needing predefined categories, which makes it well suited to processing hundreds of open-ended survey responses quickly.

And MDS (multidimensional scaling) lets you visualize how similar or different users are in how they think or speak, making it easy to spot shared mindsets, outliers, or cohort patterns.

These methods are game-changers. They don't replace deep research; they make it faster, clearer, and more actionable. I've been building these into my own workflow using R, and they've made a big difference in how I approach qualitative data. If you're working in UX research or service design and want to level up your analysis, these are worth trying.
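For anyone curious what the topic-modeling step can look like in code, here is a minimal sketch in Python with scikit-learn (the post's own workflow is in R; the sample responses, parameters, and theme count below are purely illustrative):

# Minimal LDA sketch: discover recurring themes in open-ended responses
# without predefined categories. Sample responses are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

responses = [
    "The checkout kept failing so I gave up",
    "Love the new dashboard, very clear layout",
    "Could not find the export button anywhere",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(responses)        # document-term matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(dtm)

terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_terms = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"Theme {i}: {', '.join(top_terms)}")

An R pipeline (e.g., tidytext plus topicmodels) follows the same shape: build a document-term matrix, fit LDA, then inspect the top terms per theme.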
Analyzing Experimental Results Effectively
-
Me, watching someone misdescribe p-values at a conference… Do you think you can pass the p-value explanation test❓

First: a p-value is → Not a badge of truth or a certificate of real-world impact

➊ 𝗔 𝗽-𝘃𝗮𝗹𝘂𝗲 𝗶𝘀
→ The probability of observing results as extreme as (or more extreme than) yours
→ 𝗚𝗶𝘃𝗲𝗻 𝘁𝗵𝗮𝘁 𝘁𝗵𝗲 𝗻𝘂𝗹𝗹 𝗵𝘆𝗽𝗼𝘁𝗵𝗲𝘀𝗶𝘀 𝗶𝘀 𝘁𝗿𝘂𝗲
—————————
For example:
➋ A 𝗽-𝘃𝗮𝗹𝘂𝗲 𝗼𝗳 𝟬.𝟬𝟯 𝗱𝗼𝗲𝘀 𝗻𝗼𝘁 𝗺𝗲𝗮𝗻:
→ “My intervention worked”
→ “There’s a 97% chance the null is false”
→ “We’ve found definitive proof”
Instead…
→ It means there’s a 3% chance that you would observe results this strong (or stronger) if there were truly no effect.
➌ 𝗪𝗵𝘆 𝗱𝗼𝗲𝘀 𝟬.𝟬𝟱 𝗺𝗮𝘁𝘁𝗲𝗿?
→ In research, we often use 0.05 as a conventional cutoff for statistical significance
→ If your p-value is less than 0.05, we say the result is “statistically significant”
→ This means: data this extreme would be unlikely if the null were true
BUT
→ Statistical significance ≠ practical relevance
→ p < 0.05 doesn’t mean “definitely effective”
→ And p > 0.05 doesn’t mean “no effect at all”
➍ 𝗧𝗵𝗶𝘀 𝗶𝘀 𝘄𝗵𝗲𝗿𝗲 𝗺𝗼𝘀𝘁 𝗴𝗲𝘁 𝗶𝘁 𝘄𝗿𝗼𝗻𝗴
→ They treat p-values as a truth switch: “Yes” or “No”
→ But statistics is nuance. Interpretation matters.
→ And misrepresenting the basics undermines public trust in science.
————————-
💬 Have you seen p-values misused or misunderstood in public discussions?
♻️ Repost to help raise the bar for statistical literacy in public health.
#StatisticalThinking #PValueMyths
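To make that definition concrete, here is a small simulation sketch in Python (the numbers are illustrative): when the null hypothesis is actually true, roughly 5% of tests still come out "significant" at the 0.05 cutoff, which is exactly what the p-value describes and nothing more.

# Simulate many experiments where the null is TRUE (no real difference)
# and count how often p < 0.05 anyway.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments, n_per_group = 10_000, 100
false_positives = 0

for _ in range(n_experiments):
    a = rng.normal(0.0, 1.0, size=n_per_group)
    b = rng.normal(0.0, 1.0, size=n_per_group)  # same distribution: null is true
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(false_positives / n_experiments)  # expect roughly 0.05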
-
Ever noticed how two UX teams can watch the same usability test and walk away with completely different conclusions? One team swears “users dropped off because of button placement,” while another insists it was “trust in payment security.” Both have quotes, both have observations, both sound convincing. The result? Endless debates in meetings, wasted cycles, and decisions that hinge more on who argues better than on what the evidence truly supports.

The root issue isn’t bad research. It’s that most of us treat qualitative evidence as if it speaks for itself. We don’t always make our assumptions explicit, nor do we show how each piece of data supports one explanation over another. That’s where things break down. We need a way to compare hypotheses transparently, to accumulate evidence across studies, and to move away from yes/no thinking toward degrees of confidence.

That’s exactly what Bayesian reasoning brings to the table. Instead of asking “is this true or false?” we ask: given what we already know, and what this new study shows, how much more likely is one explanation compared to another? This shift encourages us to make priors explicit, assess how strongly each observation supports one explanation over the alternatives, and update beliefs in a way that is transparent and cumulative. Today’s conclusions become the starting point for tomorrow’s research, rather than isolated findings that fade into the background.

Here’s the big picture for your day-to-day work: when you synthesize a usability test or interview data, try framing findings in terms of competing explanations rather than isolated quotes. Ask what you think is happening and why, note what past evidence suggests, and then evaluate how strongly the new session confirms or challenges those beliefs. Even a simple scale such as “weakly,” “moderately,” or “strongly” supporting one explanation over another moves you toward Bayesian-style reasoning. This practice not only clarifies your team’s confidence but also builds a cumulative research memory, helping you avoid repeating the same arguments and letting your insights grow stronger over time.
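As a rough illustration of what that kind of updating can look like, here is a toy sketch in Python; the prior odds and likelihood ratios are invented, subjective judgments made by a team, not outputs of any tool.

# Toy Bayesian update for two competing explanations of a checkout drop-off.
# All numbers are subjective, made-up judgments for illustration.
prior_odds = 1.0  # "button placement" vs "payment trust": start at even odds

# For each new observation, judge how much more likely it is under
# explanation A (button placement) than under explanation B (payment trust).
likelihood_ratios = [
    3.0,  # session 1: user never noticed the button -> strongly favors A
    1.2,  # session 2: roughly neutral
    0.5,  # session 3: user hesitated at the card form -> favors B
]

posterior_odds = prior_odds
for lr in likelihood_ratios:
    posterior_odds *= lr

posterior_prob_a = posterior_odds / (1 + posterior_odds)
print(f"P(button placement | evidence) ≈ {posterior_prob_a:.2f}")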
-
Stop losing your analysis files. The difference between chaos and clarity in computational biology is one habit: How you name your files/folders 🧵

1/ Ever dug through a maze of folders named “final,” “final2,” “final_really”—only to wonder which one held the real results? That’s chaos.

2/ The cure? Name folders with dates. Every run. Every project. Every result. It’s the simplest way to never lose track of your work again.

3/ Try this in Linux/macOS:
mkdir $(date +%F)
It creates a folder like 2025-03-26 in YYYY-MM-DD format. Clean. Automatic. Foolproof.

4/ Why this format works: It sorts naturally in file explorers. It avoids confusion (is 03-05 March 5th or May 3rd?).

5/ Example: You’re running RNA-seq. Structure it like this:
project/
  raw_data/
  results/2025-03-26/
  scripts/2025-03-26_differential_analysis.Rmd
No guessing. No overwrites. Just clarity.

6/ If you analyze data daily, automate it:
#!/bin/bash
mkdir -p results/$(date +%F)
Every run gets its own results folder. No exceptions.

7/ More naming habits that save your sanity: Use underscores, never spaces. Keep names short but clear (2025-03-26_qc_reads.txt). Avoid symbols like !@#* that break scripts.

8/ Want to go deeper? Read this classic guide: https://lnkd.in/efYfB-xz It’s a must-read for any computational biologist.

9/ Key takeaways: Date your directories. Automate consistency. Respect your future self. Your science is too important to get lost in a folder called “final_final2.”

I hope you've found this post helpful. Follow me for more. Subscribe to my FREE newsletter chatomics to learn bioinformatics https://lnkd.in/erw83Svn
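For anyone scripting in Python rather than bash, a minimal cross-platform equivalent of the dated-results habit might look like this (the results/ layout is the same one used above):

# Create a dated results folder, e.g. results/2025-03-26 (YYYY-MM-DD).
from datetime import date
from pathlib import Path

run_dir = Path("results") / date.today().isoformat()
run_dir.mkdir(parents=True, exist_ok=True)  # behaves like mkdir -p
print(f"Writing today's outputs to {run_dir}")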
-
“I ran an experiment showing positive lift but didn’t see the results in the bottom line.” I think we’ve all had this experience: We set up a nice, clean A/B test to check the value of a feature or a creative. We get the results back: 5% lift, statistically significant. Nice! Champagne bottle pops, etc., etc. Since we got the win, we bake the 5% lift into our forecast for next quarter when the feature will roll out to the entire customer base and we sit back to watch the money roll in.

But then, shockingly, we do not actually see that lift. When we look at our overall metrics we may see a very slight lift around when the feature got rolled out, but then it goes back down and it seems like it could just be noise anyway. Since we had baked our 5% lift into our forecast, and we definitely don’t have the 5% lift, we’re in trouble. What happened?

The big issue here is that we didn’t consider uncertainty. When interpreting the results of our A/B test, we said “It’s a 5% lift, statistically significant” which implies something like “It’s definitely a 5% lift”. Unfortunately, this is not the right interpretation. The right interpretation is: “There was a statistically significant positive (i.e., >0) lift, with a mean estimate of 5%, but the experiment is consistent with a lift result ranging from 0.001% to 9.5%”. Because of well-known biases associated with this type of null-hypothesis testing, it’s most likely that the actual result was some very small positive lift, but our test just didn’t have enough statistical power to narrow the uncertainty bounds very much.

So, what does this mean? When you’re doing any type of experimentation, you need to be looking at the uncertainty intervals from the test. You should never just report out the mean estimate from the test and say that’s “statistically significant”. Instead, you should always report out the range of metrics that are compatible with the experiment. When actually interpreting those results in a business context, you generally want to be conservative and assume the actual results will come in on the low end of the estimate from the test, or if it’s mission-critical then design a test with more statistical power to confirm the result. If you just look at the mean results from your test, you are highly likely to be led astray! You should always be looking first at the range of the uncertainty interval and only checking the mean last.

To learn more about Recast, you can check us out here: https://lnkd.in/e7BKrBf4
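As a rough numerical sketch of "report the range, not just the mean", here is a normal-approximation example in Python for a conversion-rate test; the counts are made up to roughly mirror the scenario above.

# Report the uncertainty interval around a lift estimate, not just the mean.
# Conversion counts below are illustrative.
import numpy as np

conv_a, n_a = 3500, 35_000   # control: conversions / users
conv_b, n_b = 3675, 35_000   # treatment: conversions / users

p_a, p_b = conv_a / n_a, conv_b / n_b
mean_lift = p_b / p_a - 1

# 95% CI for the difference in rates (normal approximation),
# then expressed as relative lift against the control rate (a rough simplification).
se = np.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
diff_lo = (p_b - p_a) - 1.96 * se
diff_hi = (p_b - p_a) + 1.96 * se

print(f"Mean lift: {mean_lift:.1%}")
print(f"Compatible with roughly {diff_lo / p_a:.1%} to {diff_hi / p_a:.1%}")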
-
Peer review is the cornerstone of scholarly publishing. Some reviewers offer gentle yet unhelpful feedback. Others may be harsh but give insightful comments. Striking the right balance is key. Let me share my approach to being the 'just right' peer reviewer.

There are 2 parts:
Part 1: What to pay attention to (per section)
Part 2: Scripts on how to critique politely
-----------
Part 1:
📝1️⃣ Abstract:
• Is it a short, clear summary of the aims, key methods, important findings, and conclusions?
• Can it stand alone?
• Does it contain unnecessary information?
🚪2️⃣ Introduction:
Study premise: Is it saying something new about something old?
• Does it summarize the current state of the topic?
• Does it address the limitations of the current state of this field?
• Does it explain why this study was necessary?
• Are the aims clear?
🧩3️⃣ Methods:
• Study design: right to answer the question?
• Population: unbiased?
• Data source and collection: clearly defined?
• Outcome: accurate, clinically meaningful?
• Variables: well justified?
• Statistical analysis: right method, sufficient power?
• Study robustness: sensitivity analysis, data management.
• Ethical concerns addressed?
🎯4️⃣ Results:
• Are results presented clearly, accurately, and in order?
• Easy to understand?
• Tables make sense?
• Measures of uncertainty (standard errors/P values) included?
📈6️⃣ Figures:
• Easy to understand?
• Figure legends make sense?
• Titles, axes clear?
🌐7️⃣ Discussion: The interpretation.
• Did they compare the findings with current literature?
• Is there a research argument? (claim + evidence)
• Limitations/strengths addressed?
• Future direction?
📚8️⃣ References:
• Key references missing?
• Do the authors cite secondary sources (narrative review papers) instead of the original paper?
------------
Part 2:
🗣️ How do you give your critique politely? Use these scripts.

Interesting/useful research question, BUT weak method:
- "The study premise is strong, but the approach is underdeveloped."

Robust research method, BUT the research question is not interesting/useful:
- "The research method is robust and well thought out, but the study premise is weak."

Bad writing:
- "While the study/research appears to be strong, the writing is difficult to follow. I recommend the authors work with a copyeditor to improve the flow/clarity and readability of the text."

Results section does not make sense:
- "The data reported in {page x/table y} should be expanded and clarified."

Wrong interpretation/wrong conclusion:
- "The authors stated that {***}, but the data does not fully support this conclusion. We can only conclude that {***}."

Poor Discussion section:
- "The authors {do not/fail to} address how their findings relate to the literature in this field."

Copy this post into a word document and save it as a template. Use this every time you have to review a paper. If you are the receiver of peer review - you can also use this to decode what the reviewer is saying.😉
-
𝐒𝐭𝐫𝐮𝐜𝐤 𝐛𝐲 𝐂𝐔𝐏𝐄𝐃: Imagine you're measuring the impact of a new feature on user engagement. Some users naturally have higher engagement than others, creating "noise" that makes it harder to detect the true effect of your feature. CUPED (Controlled-experiment Using Pre-Experiment Data) uses historical data (from before the experiment) to account for these pre-existing differences.

Think of it like this: If you know Alice typically spends 2 hours per day on your app while Bob spends 30 minutes, you can "adjust" their experiment behavior based on these baselines. Instead of comparing their raw usage during the experiment, you compare how much they deviated from their usual patterns.

By removing this pre-existing variance, CUPED can help:
1️⃣ Detect smaller changes more reliably
2️⃣ Reach statistical significance faster
3️⃣ Run experiments with smaller sample sizes

For those familiar with statistics and econometrics, CUPED may seem similar to simply adding pre-experiment covariates in a linear regression or even a difference-in-differences approach, but it is in fact a slightly more efficient approach because it is designed specifically for reducing variance and increasing experimental power. It is one of those rare instances where you get some benefit without having to trade off anything. In most industrial experimentation platforms, CUPED is applied by default to every analysis.

See the visualization of how CUPED reduces variance and increases experimentation power (𝘍𝘪𝘨𝘶𝘳𝘦 𝘤𝘳𝘦𝘥𝘪𝘵𝘴: 𝘉𝘰𝘰𝘬𝘪𝘯𝘨.𝘤𝘰𝘮 𝘣𝘭𝘰𝘨, 𝘭𝘪𝘯𝘬 𝘪𝘯 𝘤𝘰𝘮𝘮𝘦𝘯𝘁𝘴)
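For the mechanics, here is a minimal numpy sketch of the CUPED adjustment on synthetic data; the metric and numbers are invented, but the adjustment itself is the standard one (theta = cov(pre, post) / var(pre)).

# Minimal CUPED sketch: reduce variance in the experiment metric
# using each user's pre-experiment baseline. Data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
pre = rng.gamma(shape=2.0, scale=30.0, size=n)        # pre-experiment minutes/day
post = 0.9 * pre + rng.normal(0, 10, size=n)          # experiment-period metric

theta = np.cov(pre, post)[0, 1] / np.var(pre, ddof=1)
post_cuped = post - theta * (pre - pre.mean())        # variance-reduced metric

print(f"Variance before: {post.var():.0f}, after CUPED: {post_cuped.var():.0f}")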
-
In earlier posts, I've discussed the immense promise and major risks associated with the new wave of text-prompted AI analytical tools, e.g., ADA, Open Interpreter, etc. Here are some best practices to avoid these pitfalls...

🔸 Prepare Written Analysis Plans - many Data Analysts are unfamiliar with this approach and even fewer regularly implement it (< 20% by my estimates). But preparing and sharing a written plan detailing your key questions and hypotheses (including their underlying theoretical basis), data collection strategy, inclusion/exclusion criteria, and methods to be used prior to performing your analyses can protect you from HARKing (hypothesizing after results are known) and generally increase the integrity, transparency and effectiveness of your analyses. Here's a prior post with additional detail: https://lnkd.in/g6VyqCsc

🔸 Split Your Dataset Before EDA - Exploratory Data Analysis is a very valuable tool, but if you perform EDA and confirmatory analyses on the same dataset, you risk overfitting and expose your analysis to risks of HARKing and p-hacking. Separating your dataset into exploratory and confirmatory partitions allows you to explore freely without compromising the integrity of subsequent analyses, and helps ensure the rigor and reliability of your findings.

🔸 Correct for the Problem of Multiple Comparisons - this refers to the inflated probability of a Type I error (the "familywise error rate") when performing multiple hypothesis tests within the same analysis. There are a number of different methods for performing this correction, but care should be taken in the selection since they have tradeoffs between the likelihoods of Type I (i.e., "false positive") and Type II (i.e., "false negative") errors.

🔸 Be Transparent - fully document the decisions you make during all of your analyses. This includes exclusion of any outliers, performance of any tests, and any deviations from your analysis plan. Make your raw and transformed data, and analysis code, available to the relevant people, subject to data sensitivity considerations.

🔸 Seek Methodological and Analysis Review - have your analysis plan and final draft analyses reviewed by qualified Data Analysts/Data Scientists. This will help ensure that your analyses are well-suited to the key questions you are seeking to answer, and that you have performed and interpreted them correctly.

None of these pitfalls are new or unique to AI analytic tools. However, the power of these tools to run dozens or even hundreds of analyses at a time with a single text prompt substantially increases the risks of running afoul of sound analytical practices. Adhering to the principles and approaches detailed above will help ensure the reliability, validity and integrity of your analyses.

#dataanalysis #statisticalanalysis #ai #powerbi
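As a small illustration of the multiple-comparisons point, here is a sketch using statsmodels with made-up p-values, comparing a familywise-error correction (Bonferroni) with a false-discovery-rate correction (Benjamini-Hochberg).

# Correct a family of p-values for multiple comparisons.
# The raw p-values below are illustrative.
from statsmodels.stats.multitest import multipletests

raw_pvalues = [0.001, 0.012, 0.030, 0.047, 0.20, 0.65]

# Bonferroni: strict familywise error control (more Type II risk)
reject_bonf, p_bonf, _, _ = multipletests(raw_pvalues, alpha=0.05, method="bonferroni")

# Benjamini-Hochberg: controls the false discovery rate (less conservative)
reject_bh, p_bh, _, _ = multipletests(raw_pvalues, alpha=0.05, method="fdr_bh")

print("Bonferroni keeps:", list(reject_bonf))
print("Benjamini-Hochberg keeps:", list(reject_bh))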
-
Still struggling with where to start when you are given a project? I have got you! Below is a step-by-step breakdown of key tasks to complete on a data analytics project.

1. Define the Project Objectives and Deliverables
🔹Identify the key questions or goals
Why? A clear goal directs what data you need and how you will analyze it.

2. Understand the Structure of Your Tables
🔹Examine each table's schema: columns, data types, relationships, and keys
Why? This is helpful before any meaningful combination or analysis.
Note: Most of the time, your project's data is located in different tables.

3. Prepare and Clean the Data
🔹Handle missing values
🔹Remove duplicates
🔹Fix formatting issues
🔹Ensure consistent units/currency/date formats
Why? Data cleaning is often the most time-consuming part, but it is essential for ensuring accuracy and reliability in your analysis.

4. Combine/Merge the Tables
🔹Use keys or common fields to combine tables
Why? It creates a complete dataset by bringing together relevant information from all the tables. It improves data quality and ensures that the analysis is comprehensive.

5. Data Enrichment (Optional)
🔹Create new variables or derive new metrics
🔹Create a date table using the date column from your table
Why? It provides additional context and improves the power of your analysis by revealing deeper insights.

6. Conduct Exploratory Data Analysis (EDA)
🔹Run summary statistics
🔹Explore patterns, trends, and anomalies in your dataset
Why? EDA helps you uncover patterns, spot errors, and decide which variables matter for analysis.

7. Perform Analysis
🔹Compare trends across time, regions, or segments
🔹Apply analytical techniques to answer initially defined questions
🔹Build KPIs
Why? Here, you extract actionable insights from your prepared dataset and test hypotheses, directly addressing your project's objectives.

8. Visualize Results
🔹Create different charts
🔹Use any visualization tool
Why? It helps stakeholders understand results more easily through clear visuals.

9. Interpret and Report Your Results
🔹Tell the story behind the data to communicate findings through reports or presentations tailored to your audience
🔹Explain what the analysis reveals, what it means, and why it matters
🔹Use concise reports, presentations, or dashboards
Why? It converts technical output into business-relevant insights. This helps stakeholders make informed decisions based on your analysis.

10. Make Data-Driven Recommendations
🔹Validate your findings by checking for errors, testing assumptions, and possibly seeking feedback from others
🔹Suggest actions to be taken
Why? Validation ensures the credibility and robustness of your conclusions before they are used in decision-making.

11. Monitor & Iterate
🔹Evaluate the impact of implemented changes
🔹Re-analyze periodically
🔹Update data pipelines or dashboards as needed
Why? It ensures your analysis stays useful and responsive to changes.

PS: What step can you add?
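As a tiny illustration of steps 3, 4, and 6, here is a pandas sketch; the file names and columns are placeholders rather than anything from the post.

# Sketch of cleaning, merging, and a first EDA pass.
# File and column names are placeholders.
import pandas as pd

orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
customers = pd.read_csv("customers.csv")

# Step 3: prepare and clean
orders = orders.drop_duplicates()
orders["amount"] = orders["amount"].fillna(0)

# Step 4: combine tables on a common key
df = orders.merge(customers, on="customer_id", how="left")

# Step 6: quick exploratory summary
print(df.describe(include="all"))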