Original Question: I have downloaded Stock data with yfinance for quite a while where I add the latest data to a previous csv file not to download all data every time. As a Stock split disturb my calculations and Curves I would like to divide all stock values before the split with a constant called "Split". I am aware that I can download the complete data again and get it correct from Yahoo Finance but I would like to it manually. My plan is, first to split the complete Dataframe to two. One before the Stock split and one the remaining. This part is working. Later I would like to merge the two Dataframes again. The first Column is "Date". Then the Columns "Open", "High", "Low", "Close", "Adj Close" should be divided with the split value and the last Column "Volume" should be multiplied with the same value. Only in the first part of the DataFrame.

Thanks to the comments from @cottontail and @luka1156 I managed to get it working. It might not be the most beautiful solution but it is working. First I make one DataFrame only with all "Date", then I spit the complete DataFrame in two, One before and one after. In these two I remove the "Date" Column. At this point I could do the math with the DataFrame before the split as it has only float values. After the math operation I appended the part after the split to it and finally I used concat to to the the complete file without "Date" and the file with all "Date". My code ended up like this. In order to follow what happen I write all files during the process but it might not be necessary.

    data = pd.read_csv(stockFile,  header=0)

    if os.stat(stockFile).st_size > 4:
        data['Date'] = pd.to_datetime(data['Date'])
        
        frafilnavn = stockFile
        print ("frafilnavn = ",  frafilnavn)
        Rigtige = 'J'

        Rigtige = str(input('Er dette den rigtige fil? J/N   *\n'))
        while True:
            if Rigtige == 'J':
                print('Indtast tidspunkt for Aktiesplit yyyy,m,d   ')

                Splitdato = year,month,day = input().split(',')
                Splitdato = datetime.datetime(year = int(year), month = int(month), day = int(day))

                Split = float(input('Antal nye Aktier for hver gamle?  *\n'))
                print("value of Split: ",Split)                 
                all_onlyDates = frafilnavn
                all_onlyDates += '_Complet Stock only Dates.csv'
                print ("All only Dates = ", all_onlyDates)
                data['Date'].to_csv(all_onlyDates, columns=['Date'],  index=0)
                
                df=data['Before-Split'] = (data['Date'] <= Splitdato)  
                BeforeSplitdata=data[data['Before-Split']]
                BeforeSplitdata
                print(BeforeSplitdata)
                
                before_allColumns = frafilnavn
                before_allColumns += '_Before Split.csv'
                print ("Before Split all Columns  = ", before_allColumns)
                BeforeSplitdata.to_csv(before_allColumns,  index=0)
                
                beforeValues = frafilnavn
                beforeValues += '_Before Split only Values.csv'
                print ("Before Split only Values = ", beforeValues)
                BeforeSplitdata.to_csv(beforeValues, columns=['Open',  'High',  'Low',  'Close',  'Adj Close',  'Volume'], index=0)
                
                
                df2=data['After-Split'] = (data['Date'] > Splitdato)
                AfterSplitdata = data[data['After-Split']]
                AfterSplitdata
                print(AfterSplitdata)
                
                after_allColumns = frafilnavn
                after_allColumns += '_After Split.csv'
                print ("After Split all Columns = ", after_allColumns)
                AfterSplitdata.to_csv(after_allColumns,  index=0)
                
                afterValues = frafilnavn
                afterValues += '_After Split only Values.csv'
                print ("After Split only Values = ", afterValues)
                AfterSplitdata.to_csv(afterValues, columns=['Open',  'High',  'Low',  'Close',  'Adj Close',  'Volume'], index=0)             
                
                dfbeforesplit = pd.read_csv(beforeValues,  header=0)        
     
                print(dfbeforesplit)    

                dfbeforesplit_div = dfbeforesplit/Split
                print('dfbeforesplit_div', dfbeforesplit_div)    
                dfbeforesplit_vol = dfbeforesplit_div['Volume']*Split*Split
                print('dfbeforesplit_vol', dfbeforesplit_vol)   
                
                Beforesplit_div = frafilnavn
                Beforesplit_div += '_Before Split divided Values.csv'                    
                dfbeforesplit_div.to_csv(Beforesplit_div , columns=['Open',  'High',  'Low',  'Close',  'Adj Close'],  index=0)                      

                dfdivsplit = pd.read_csv(Beforesplit_div,  header=0)

                dfcorrect = dfbeforesplit_div.join(dfbeforesplit_vol, rsuffix='Volume')
                
                
                print('Corrected values', dfcorrect)
                
                Corrected = frafilnavn
                Corrected += '_Before Split corrected Values.csv'
                print ("Before Split corrected Values = ", Corrected)
                dfcorrect = dfcorrect.rename(columns={'Volume': 'OldVolume', 'VolumeVolume': 'Volume'})
                dfcorrect.to_csv(Corrected, columns=['Open',  'High',  'Low',  'Close',  'Adj Close',  'Volume'],  index=0)  
            
                dfcorrectsplit = pd.read_csv(Corrected,  header=0)
                Corrected = frafilnavn
                Corrected += '_Before Split corrected Values.csv'
                print ("Indlæst fil er = ", Corrected)    

                
                frafil = afterValues
                print ("frafil = ", frafil)
                       
                
                tilfil = Corrected
                print ("tilfil = ", tilfil)
                
                bckupfil = tilfil
                bckupfil += '_bckup.csv'                    
                 
                file_in = frafil #'in.csv'
                file_out = tilfil #'out.csv'
                copyfile(tilfil, bckupfil)



                with open(file_in,  "r" ) as f_input, open(file_out, "a") as f_output : #, open(file_temp, "w") as f_temp:
                    csv_input = csv.reader(f_input)

                    csv.writer(f_output).writerows(itertools.islice(csv_input, 1, 500))
                    
                with open(tilfil) as csvfile:
                    print(tilfil)

                    KompletuDato = Corrected
                    
                    data1 = pd.read_csv(KompletuDato,  header=0)
                    print(KompletuDato)
                    data2 = pd.read_csv(all_onlyDates,  header=0)
                
                    resultFile = stockFile
                    resultFile += '_færdig.csv'
                    result = pd.concat([data2, data1], axis=1) 
                    result.to_csv(resultFile,  index=0)
                    print(result)
                    os.rename(resultFile, FinishFile)

In the directory I get a file with the same name as I called but with Values corrected. I have also a backup file of the original just in case. I will try the last comments from luka1156 also but here is what I did.

1

There are 1 best solutions below

8
On

To create a new DataFrame with the before and after split values you could just do:

df = df.set_index('Date')
split_n = 2
dfsplit = df/split_n 
dfsplit['Volume'] = dfsplit['Volume']*split_n*split_n

df1 = df.join(dfsplit, rsuffix='_Split')