Downsizing (downsampling) or upsizing (upsampling) an array in Python


I am relatively new to coding. Please ask questions if something is unclear!

I am generating an x-array to use as input for a beta PDF. The x-values depend on the gradient: the higher the gradient, the smaller my 'cell widths' get, meaning the closer my x-values are to each other. The sum of all cell widths must be one, since I want to use them to generate the beta PDF with better quality. My problem is that the new x-array comes out with a different length depending on the gradient. I want a target number of values in my x-array, for example 300, even if the original result has, say, 1249 values. Meanwhile it needs to keep its relative distribution, otherwise the whole process of making it gradient-dependent is useless. Thank you guys!

input = 100 x-values

output should be, let's say, 300 x-values, but distributed according to the gradient
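
To illustrate what I mean by keeping the relative distribution, here is a toy sketch of the resampling I'm after (the helper name resample_widths and the uniform stand-in widths are just mine for illustration):

import numpy as np

def resample_widths(widths, target):
    # hypothetical helper: stretch/shrink a set of cell widths that
    # sum to ~1 onto 'target' cells, keeping their relative distribution
    positions = np.concatenate(([0.0], np.cumsum(widths)))  # old cell boundaries
    old_idx = np.linspace(0.0, 1.0, len(positions))         # normalized old index
    new_idx = np.linspace(0.0, 1.0, target + 1)             # normalized new index
    new_positions = np.interp(new_idx, old_idx, positions)
    return np.diff(new_positions)                           # 'target' new widths

widths = np.full(1249, 1 / 1249)          # stand-in for my 1249 cell widths
new_widths = resample_widths(widths, 300)
print(len(new_widths), new_widths.sum())  # 300 and ~1.0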

A small explanation of what I did before:

import numpy as np
import scipy.stats as stats

b = 1
a = 0
werte = 100
alpha = 2  #for example
beta1 = 1.5
prozent = 10

x_array = np.linspace(0, 1, werte)
pdf_values = stats.beta.pdf(x_array, alpha, beta1)
differences = np.diff(pdf_values)
gradients = differences / ((b - a) / (werte - 1))
pdf_gradients_abs = np.abs(gradients)
pv_array = np.linspace(a, b, num=werte)
gradient_array = [pv_array[i + 1] - pv_array[i] for i in range(len(pv_array) - 1)]

# replace each boundary difference with the midpoint of cell i
for i in range(len(gradient_array)):
    gradient_array[i] = 1 / ((werte - 1) * 2) + i * (1 / (werte - 1))

integral_pdf_gradients = np.trapz(pdf_gradients_abs, x=gradient_array)

c_delta = (prozent / 100) * integral_pdf_gradients
f_delta = 1 / (pdf_gradients_abs + c_delta)
integral_f_delta = np.trapz(f_delta, gradient_array)
d_x_rel = f_delta / integral_f_delta  #  <- my 'cell width'
zellgroesse = 1 / (werte - 1)  # max width one cell can have (the gradient-affected area)
neues_delta_x = np.array([])

for value in d_x_rel:
    if value >= zellgroesse:
        neues_delta_x = np.append(neues_delta_x, zellgroesse)
    else:
        # split the cell into n_round equal sub-cells (capped at 1000)
        n = zellgroesse / value
        n_round = int(np.rint(n))
        if n_round >= 1000:
            n_round = 1000
        x_cor = zellgroesse / n_round
        for i in range(n_round):
            neues_delta_x = np.append(neues_delta_x, x_cor)

The output x-array is distributed the way I need it, but it has too many values or not enough.

I need 300 x-array values; this one might have 1539!
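
To make the mismatch concrete, here is a toy run of the rounding step with made-up widths; each cell is split independently, so the final count is essentially arbitrary:

import numpy as np

zellgroesse = 1 / 99                       # max cell width, as above
d_x_rel = np.array([0.02, 0.004, 0.0007])  # made-up relative widths
for value in d_x_rel:
    if value >= zellgroesse:
        print("1 cell")
    else:
        n_round = int(np.rint(zellgroesse / value))
        print(n_round, "cells")            # 3 cells, then 14 cells: the total depends on the data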


Answer by mond:
  • as you already did: first sample your function
  • then calculate the absolute steepness
  • sum up those steepnesses with cumsum
  • the resulting sequence tells you how many points you should have up to that x value
  • scale that function to the number of points you want
  • use np.interp to interpolate the inverse of that function
  • plug in an array of indices to get the new x values
import numpy as np
from scipy import stats

# first sample the beta pdf with the parameters you want,
# here 2.5 and 0.6
# (this pdf goes to infinity at 1.0, which is why the
# steepness is capped below)

xv = np.linspace(0, 1, 100)
pdf_values = stats.beta.pdf(xv, 2.5, 0.6)

# we want to know where the absolute value of the gradient
# is high, in order to place more points there
dpf = np.abs(np.diff(pdf_values))

# but we cap the steepness at a maximum (here 10)
dpf = np.where(dpf > 10, 10, dpf)

# add up the steepness values and
# pad with a zero at the beginning
xp = np.cumsum(dpf)
xp = np.pad(xp, (1, 0))

# now we build the interpolation, scaled to the number of
# points we want at the maximum (here 300).
# note that this is the "inverse" function: the scaled xp
# function would tell us how many points we want up to a
# given x value; with the inverse (x and y swapped) we can
# plug in 0 to index-1 and get x values back.
xmax = xp[-1]
pv = lambda vec: np.interp(vec, xp * 300 / xmax, xv)

# now we calculate our new x values
# (the 300 here should match the 300 above)
idx = np.arange(300)
newx = pv(idx)
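
As a quick check (my addition, not part of the answer above), you can verify that newx has the requested length and that the points cluster where the pdf is steep, i.e. near 1.0:

# evaluate the pdf on the new, gradient-adapted grid
pdf_new = stats.beta.pdf(newx, 2.5, 0.6)

print(len(newx), len(pdf_new))  # 300 300
print(np.diff(newx)[:3])        # wide spacing where the pdf is flat (near 0)
print(np.diff(newx)[-3:])       # tight spacing where it is steep (near 1.0)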