Building and Testing Stock Portfolios in R (2024)

Using data science to make smarter investment decisions

Published in

Towards Data Science



Jun 23, 2020


In this article, we’ll examine how to get stock data, analyze it to make investment decisions, and visualize the results.

With the recent surge in retail investors entering the market, it’s more important than ever that new traders are armed with the tools they need to compare stocks by analyzing their performance over time. In this post, we’ll use the stock data of three familiar companies — Starbucks, Carnival, and Apple — to construct a portfolio, examine its historical performance, and compare it to the S&P 500.

Three very useful packages for financial analysis in R are quantmod, to pull stock data from Yahoo Finance; PerformanceAnalytics, to construct and test portfolios; and dygraphs, to produce interactive and informative visualizations of our data. If you don’t have these packages installed, you can install and load them into your R environment using the code below.
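The original code listing is not reproduced on this page. A minimal sketch of the install-and-load step, assuming the three packages are fetched from CRAN (the mirror URL is my choice, not the author’s), could look like this:

```r
# Packages named in the article
pkgs <- c("quantmod", "PerformanceAnalytics", "dygraphs")

# Install only the packages that are not already present
# (the CRAN mirror below is an assumption; any mirror works)
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}

# Attach each package to the current session
invisible(lapply(pkgs, library, character.only = TRUE))
```

The correlation plot later in the article is called as corrplot::corrplot(), so the corrplot package would need to be installed as well if you want to reproduce that step.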

Once we have the packages installed and loaded, we can write a function to get monthly return data for our individual stocks. Our function takes two arguments: ticker, the stock’s symbol, and base_year, the year that we want to start analyzing the data.

This function can appear a bit complicated if you are unfamiliar with R and its packages. Here’s an explanation of each line:

  • Line 4: Passes our ticker symbol to the getSymbols() function to get our stock data from Yahoo Finance
  • Lines 5–8: Removes any missing values from the data and isolates the “Adjusted Price” column (the sixth column), which accounts for stock splits, dividends, and other corporate actions (more on that here)
  • Lines 10–12: Uses R’s built-in paste0() and Sys.Date() functions to create a string we can pass between brackets so that only observations between our base year and today’s date are selected
  • Line 15: Calculates monthly arithmetic returns for our adjusted closing stock price data
  • Line 18: Assigns our monthly return data to R’s global environment to ensure that we can access it by its ticker symbol later
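The function itself is missing from this copy of the article. The sketch below is a reconstruction from the description above, not the author’s exact code, so its line numbers will not necessarily match the ones cited in the bullets. The function name monthly_returns and the local variable names are assumptions.

```r
library(quantmod)

# Reconstructed sketch: monthly arithmetic returns for one ticker
monthly_returns <- function(ticker, base_year) {
  # Get daily price data from Yahoo Finance as an xts object
  stock <- getSymbols(ticker, src = "yahoo", auto.assign = FALSE)

  # Remove missing values and keep the adjusted price (sixth column)
  stock <- na.omit(stock)
  stock <- stock[, 6]

  # Build a "base_year/today" range string and subset the series with it
  horizon <- paste0(as.character(base_year), "/", as.character(Sys.Date()))
  stock <- stock[horizon]

  # Calculate monthly arithmetic returns
  data <- periodReturn(stock, period = "monthly", type = "arithmetic")

  # Store the result in the global environment under the ticker's name
  assign(ticker, data, envir = .GlobalEnv)
}
```

Calling monthly_returns("SBUX", 2015) should then create an object named SBUX in the global environment. The call needs an internet connection, since the prices come from Yahoo Finance.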

With our function written, we can use it to calculate monthly returns for our three stocks. We call it four times: once for each stock we want to analyze and once for the S&P 500, which gives us a benchmark against which to judge each stock’s monthly performance. We can then merge all our monthly returns into one time series object and look at the last several years’ performance for each using the dygraphs package (check out the interactive version here). The last line prints the last five months of return data and its output is reproduced below.
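A sketch of this step follows. To keep it runnable on its own, it uses a small helper (fetch_monthly, a hypothetical name) that returns each series rather than assigning it globally. The "^GSPC" symbol for the S&P 500 index, the column labels, and the chart title are assumptions, not taken from the original code.

```r
library(quantmod)
library(dygraphs)

# Hypothetical helper: monthly arithmetic returns for one ticker since base_year
fetch_monthly <- function(ticker, base_year) {
  prices <- na.omit(getSymbols(ticker, src = "yahoo", auto.assign = FALSE))[, 6]
  prices <- prices[paste0(base_year, "/", Sys.Date())]
  periodReturn(prices, period = "monthly", type = "arithmetic")
}

# Three stocks plus the benchmark; "^GSPC" is assumed as the S&P 500 symbol
tickers <- c(SBUX = "SBUX", CCL = "CCL", AAPL = "AAPL", SP500 = "^GSPC")
series <- lapply(tickers, fetch_monthly, base_year = 2015)

# Merge the four series into one xts object, one column per ticker
returns <- do.call(merge, unname(series))
colnames(returns) <- names(tickers)

# Interactive chart of the monthly returns
dygraph(returns, main = "Monthly returns since 2015") %>%
  dyAxis("y", label = "Return") %>%
  dyRangeSelector()

# Print the last five months of return data
print(tail(returns, 5))
```

Because the data are downloaded at run time, the printed values depend on the day you run the code.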

[Figure: interactive dygraphs chart of monthly returns for Starbucks, Carnival, Apple, and the S&P 500]
[Figure: printed output showing the last five months of return data]

From our returns data set, we can get a sense of how well each stock has performed relative to the S&P 500 over the last several years. For example, when news of the spread of COVID-19 was roiling financial markets in March 2020, the S&P 500 recorded a loss of about 12.5% while Carnival lost more than 60% of its value for that month. Apple, by contrast, experienced a loss of only about 7%. Calling corrplot::corrplot(cor(returns), method = "number") generates a correlation matrix that shows how these stocks’ returns are related to each other.

[Figure: correlation matrix of monthly returns for Starbucks, Carnival, Apple, and the S&P 500]

A fundamental principle of portfolio management is that you should select stocks with low correlations with each other. You wouldn’t want all the stocks in your portfolio to always rise and fall together, because that could expose you to excess volatility that you may want to avoid, especially if this is a retirement account where preservation of principal is your main concern. A portfolio made up of highly correlated stocks is poorly diversified and remains exposed to unsystematic risk, the firm-specific risk inherent to each stock.

From our correlation matrix, we observe that all of our stocks are positively correlated, albeit to varying degrees. Apple is only weakly correlated with Starbucks (0.27), but Carnival and the S&P 500 have a high correlation (0.71) with each other. Importantly, all of our stocks have a fairly high positive correlation with the market, meaning they tend to move with the market most months.
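To make the link between correlation and volatility concrete, here is a small sketch using made-up return series (not market data). Asset B moves in lockstep with A, while D is uncorrelated with A, so mixing A with D should produce a steadier portfolio than mixing A with B. All names and numbers are illustrative assumptions.

```r
# Toy monthly returns for hypothetical assets (made-up numbers, not market data)
toy <- cbind(
  A = c(-0.02, -0.01,  0.00,  0.01, 0.02),
  B = c(-0.04, -0.02,  0.00,  0.02, 0.04),  # always twice A: moves in lockstep
  C = c( 0.02,  0.01,  0.00, -0.01, -0.02), # mirror image of A
  D = c( 0.02, -0.01, -0.02, -0.01, 0.02)   # no linear relationship with A
)

# Pairwise correlations, the same matrix cor(returns) would give for real data
corr <- cor(toy)
print(round(corr, 2))

# Volatility (standard deviation) of two 50/50 mixes
sd_correlated   <- sd(0.5 * toy[, "A"] + 0.5 * toy[, "B"])
sd_uncorrelated <- sd(0.5 * toy[, "A"] + 0.5 * toy[, "D"])
print(c(correlated = sd_correlated, uncorrelated = sd_uncorrelated))

# With the corrplot package installed, the matrix can be drawn as in the article:
# corrplot::corrplot(corr, method = "number")
```

The 50/50 mix of A and D comes out less volatile than the mix of A and B, which is the diversification effect the paragraph above describes.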

We can use the PerformanceAnalytics package to assign weights to our stocks and build a hypothetical portfolio from them. In the following code, we assume that we are investing one-third of our money in Starbucks, one-third in Carnival, and one-third in Apple, excluding the S&P 500 for the moment. The Return.portfolio() function allows us to pass in our individual stock data from the returns object along with their weights. We can set the wealth.index argument equal to TRUE to show how $1 invested in our portfolio in 2015 would have grown over time. Then, we can follow the same process for the S&P 500, excluding the weights argument. After merging our data into one xts object, we can contrast our portfolio with the S&P 500 using another graph (interactive version here):
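The portfolio code is also missing from this copy. The sketch below shows the same sequence of calls on a tiny made-up returns object so that it runs without a data download; with real data you would pass the merged returns object instead. The column names and numbers are illustrative assumptions.

```r
library(PerformanceAnalytics)  # also attaches xts

# Toy monthly returns (made-up numbers) standing in for the article's returns object
dates <- as.Date(c("2015-01-31", "2015-02-28", "2015-03-31"))
returns <- xts(
  cbind(
    SBUX  = c(0.10, 0.00, -0.05),
    CCL   = c(0.00, 0.10,  0.00),
    AAPL  = c(0.20, 0.00,  0.05),
    SP500 = c(0.05, 0.02, -0.01)
  ),
  order.by = dates
)

# One-third of our money in each stock
weights <- c(1 / 3, 1 / 3, 1 / 3)

# Growth of $1 in the three-stock portfolio (no re-balancing by default)
portfolio <- Return.portfolio(returns[, c("SBUX", "CCL", "AAPL")],
                              weights = weights, wealth.index = TRUE)

# Same process for the benchmark, leaving out the weights argument
benchmark <- Return.portfolio(returns[, "SP500"], wealth.index = TRUE)

# Merge into one xts object for comparison or plotting
comparison <- merge(portfolio, benchmark)
colnames(comparison) <- c("Portfolio", "SP500")
print(comparison)

# With dygraphs installed, the comparison can be charted interactively:
# dygraphs::dygraph(comparison, main = "Growth of $1")
```

Each row of comparison is the value of $1 invested at the start, so a final value of 1.9 would correspond to a 90% gain.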

[Figure: growth of $1 invested in the Starbucks-Carnival-Apple portfolio compared with the S&P 500]

If we had invested $1 in our Starbucks-Carnival-Apple portfolio at the beginning of 2015 and left it untouched (i.e., no re-balancing), its value would have almost doubled, a gain of 92.31%. This beat the S&P 500, which still yielded an impressive return of about 73% over the same time horizon. Of course, we could make some minor tweaks to our code to change portfolio weights, add stocks (or other assets like government bonds and precious metals), and experiment with re-balancing. With R and its libraries, our ability to construct and test portfolios is virtually unlimited.
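As one example of such a tweak, Return.portfolio() accepts a rebalance_on argument. The sketch below compares a buy-and-hold portfolio with one re-balanced monthly, using unequal weights and made-up returns; the numbers and weights are illustrative assumptions, not results from the article.

```r
library(PerformanceAnalytics)  # also attaches xts

# Toy monthly returns (made-up numbers, not market data)
dates <- as.Date(c("2015-01-31", "2015-02-28", "2015-03-31", "2015-04-30"))
stocks <- xts(
  cbind(
    SBUX = c(0.10,  0.00, -0.05, 0.02),
    CCL  = c(0.00,  0.10,  0.00, -0.03),
    AAPL = c(0.20,  0.00,  0.05, 0.01)
  ),
  order.by = dates
)

# A different allocation: half in Starbucks, a quarter in each of the others
weights <- c(0.50, 0.25, 0.25)

# Buy and hold: weights drift as prices move
buy_hold <- Return.portfolio(stocks, weights = weights, wealth.index = TRUE)

# Re-balanced back to the target weights at the end of every month
rebalanced <- Return.portfolio(stocks, weights = weights, wealth.index = TRUE,
                               rebalance_on = "months")

both <- merge(buy_hold, rebalanced)
colnames(both) <- c("BuyAndHold", "Rebalanced")
print(both)
```

The two columns agree in the first month, when both portfolios hold the starting weights, and can diverge afterwards as the buy-and-hold weights drift.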

I hope I’ve made the process of getting and analyzing stock data a little less intimidating. With the code above, we were able to import a large amount of financial information, construct a portfolio, examine its composition, and analyze historical performance relative to a benchmark. In future posts, I’ll explore some other ways we can use R and its libraries to analyze financial data.

Thanks for reading!

Article by Christian Kincaid.
