Message ID | 20210908135356.18689-1-macroalpha82@gmail.com (mailing list archive) |
---|---|

State | New |

Headers | show |

Series | drm/rockchip: Update crtc fixup to account for fractional clk change | expand |

On Wed, Sep 08, 2021 at 08:53:56AM -0500, Chris Morgan wrote: > From: Chris Morgan <macromorgan@hotmail.com> > > After commit 928f9e268611 ("clk: fractional-divider: Hide > clk_fractional_divider_ops from wide audience") was merged it appears > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > examination of the problem, it looks like it was the fixup in the > rockchip_drm_vop.c file was causing the issue. The changes made to the > clk driver appear to change some assumptions made in the fixup. > > After debugging the working 5.14 kernel and the no-longer working > 5.15 kernel, it looks like this was broken all along but still > worked, whereas after the fractional clock change it stopped > working despite the issue (it went from sort-of broken to very broken). > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > 17000999 on my board. The clock driver was taking the value of the > parent clock and attempting to divide the requested value from it > (17000000/17000999 = 0), then subtracting 1 from it (making it -1), > and running it through fls_long to get 64. It would then subtract > the value of fd->mwidth from it to get 48, and then bit shift > 17000999 to the left by 48, coming up with a very large number of > 7649082492112076800. This resulted in a numerator of 65535 and a > denominator of 1 from the clk driver. The driver seemingly would > try again and get a correct 1:1 value later, and then move on. > > Output from my 5.14 kernel (with some printfs for good measure): > [ 2.830066] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > [ 2.839431] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > [ 2.855980] Clock is dclk_vopb_frac > [ 2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > [ 2.903529] Clock is dclk_vopb_frac > [ 2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > [ 2.903579] Clock is dclk_vopb_frac > [ 2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > Contrast this with 5.15 after the clk change where the rate of 17000999 > was getting passed and resulted in numerators/denomiators of 17001/ > 17000. > > Output from my 5.15 kernel (with some printfs added for good measure): > [ 2.817571] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > [ 2.826975] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > [ 2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, Best Denominator 17017 > [ 2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > [ 2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > [ 2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > After tracing through the code it appeared that this function here was > adding a 999 to the requested frequency because of how the clk driver > was rounding/accepting those frequencies. I believe after the changes > made in the commit listed above the assumptions listed in this driver > are no longer true. When I remove the + 999 from the driver the DSI > panel begins to work again. > > Output from my 5.15 kernel with 999 removed (printfs added): > [ 2.852054] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > [ 2.864483] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > [ 2.880869] Clock is dclk_vopb_frac > [ 2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > [ 2.928521] Clock is dclk_vopb_frac > [ 2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > [ 2.928570] Clock is dclk_vopb_frac > [ 2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > I have tested the change extensively on my Odroid Go Advance (Rockchip > RK3326) and it appears to work well. However, this change will affect > all Rockchip SoCs that use this driver so I believe further testing > is warranted. Please note that without this change I can confirm > at least all PX30s with DSI panels will stop working with the 5.15 > kernel. To me it all makes a lot of sense, thank you for deep analysis of the issue! In any case I think we will need a Fixes tag to something (either one of clk-fractional-divider.c series or preexisted). Anyway, FWIW, Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> > Signed-off-by: Chris Morgan <macromorgan@hotmail.com> > --- > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++------------------ > 1 file changed, 3 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > index ba9e14da41b4..bfef4f52dce6 100644 > --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > @@ -1169,31 +1169,16 @@ static bool vop_crtc_mode_fixup(struct drm_crtc *crtc, > * > * - DRM works in in kHz. > * - Clock framework works in Hz. > - * - Rockchip's clock driver picks the clock rate that is the > - * same _OR LOWER_ than the one requested. > * > * Action plan: > * > - * 1. When DRM gives us a mode, we should add 999 Hz to it. That way > - * if the clock we need is 60000001 Hz (~60 MHz) and DRM tells us to > - * make 60000 kHz then the clock framework will actually give us > - * the right clock. > - * > - * NOTE: if the PLL (maybe through a divider) could actually make > - * a clock rate 999 Hz higher instead of the one we want then this > - * could be a problem. Unfortunately there's not much we can do > - * since it's baked into DRM to use kHz. It shouldn't matter in > - * practice since Rockchip PLLs are controlled by tables and > - * even if there is a divider in the middle I wouldn't expect PLL > - * rates in the table that are just a few kHz different. > - * > - * 2. Get the clock framework to round the rate for us to tell us > + * 1. Get the clock framework to round the rate for us to tell us > * what it will actually make. > * > - * 3. Store the rounded up rate so that we don't need to worry about > + * 2. Store the rounded up rate so that we don't need to worry about > * this in the actual clk_set_rate(). > */ > - rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); > + rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); > adjusted_mode->clock = DIV_ROUND_UP(rate, 1000); > > return true; > -- > 2.25.1 >

On Wed, Sep 08, 2021 at 09:05:52PM +0300, Andy Shevchenko wrote: > On Wed, Sep 08, 2021 at 08:53:56AM -0500, Chris Morgan wrote: > > From: Chris Morgan <macromorgan@hotmail.com> > > > > After commit 928f9e268611 ("clk: fractional-divider: Hide > > clk_fractional_divider_ops from wide audience") was merged it appears > > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > > examination of the problem, it looks like it was the fixup in the > > rockchip_drm_vop.c file was causing the issue. The changes made to the > > clk driver appear to change some assumptions made in the fixup. > > > > After debugging the working 5.14 kernel and the no-longer working > > 5.15 kernel, it looks like this was broken all along but still > > worked, whereas after the fractional clock change it stopped > > working despite the issue (it went from sort-of broken to very broken). > > > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > > 17000999 on my board. The clock driver was taking the value of the > > parent clock and attempting to divide the requested value from it > > (17000000/17000999 = 0), then subtracting 1 from it (making it -1), > > and running it through fls_long to get 64. It would then subtract > > the value of fd->mwidth from it to get 48, and then bit shift > > 17000999 to the left by 48, coming up with a very large number of > > 7649082492112076800. This resulted in a numerator of 65535 and a > > denominator of 1 from the clk driver. The driver seemingly would > > try again and get a correct 1:1 value later, and then move on. > > > > Output from my 5.14 kernel (with some printfs for good measure): > > [ 2.830066] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > [ 2.839431] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > [ 2.855980] Clock is dclk_vopb_frac > > [ 2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > > [ 2.903529] Clock is dclk_vopb_frac > > [ 2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > [ 2.903579] Clock is dclk_vopb_frac > > [ 2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > > > Contrast this with 5.15 after the clk change where the rate of 17000999 > > was getting passed and resulted in numerators/denomiators of 17001/ > > 17000. > > > > Output from my 5.15 kernel (with some printfs added for good measure): > > [ 2.817571] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > [ 2.826975] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > [ 2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, Best Denominator 17017 > > [ 2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > [ 2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > [ 2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > > > After tracing through the code it appeared that this function here was > > adding a 999 to the requested frequency because of how the clk driver > > was rounding/accepting those frequencies. I believe after the changes > > made in the commit listed above the assumptions listed in this driver > > are no longer true. When I remove the + 999 from the driver the DSI > > panel begins to work again. > > > > Output from my 5.15 kernel with 999 removed (printfs added): > > [ 2.852054] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > [ 2.864483] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > [ 2.880869] Clock is dclk_vopb_frac > > [ 2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > [ 2.928521] Clock is dclk_vopb_frac > > [ 2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > [ 2.928570] Clock is dclk_vopb_frac > > [ 2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > > > I have tested the change extensively on my Odroid Go Advance (Rockchip > > RK3326) and it appears to work well. However, this change will affect > > all Rockchip SoCs that use this driver so I believe further testing > > is warranted. Please note that without this change I can confirm > > at least all PX30s with DSI panels will stop working with the 5.15 > > kernel. > > To me it all makes a lot of sense, thank you for deep analysis of the issue! > In any case I think we will need a Fixes tag to something (either one of > clk-fractional-divider.c series or preexisted). Would this work for a "fixes"? commit 287422a95fe2 ("drm/rockchip: Round up _before_ giving to the clock framework") That's the commit that introduced this "bug", even though it wasn't a problem until the new commit that changed the way fractional clocks worked for rockchip. Thank you. > > Anyway, FWIW, > Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> > > > Signed-off-by: Chris Morgan <macromorgan@hotmail.com> > > --- > > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++------------------ > > 1 file changed, 3 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > index ba9e14da41b4..bfef4f52dce6 100644 > > --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > @@ -1169,31 +1169,16 @@ static bool vop_crtc_mode_fixup(struct drm_crtc *crtc, > > * > > * - DRM works in in kHz. > > * - Clock framework works in Hz. > > - * - Rockchip's clock driver picks the clock rate that is the > > - * same _OR LOWER_ than the one requested. > > * > > * Action plan: > > * > > - * 1. When DRM gives us a mode, we should add 999 Hz to it. That way > > - * if the clock we need is 60000001 Hz (~60 MHz) and DRM tells us to > > - * make 60000 kHz then the clock framework will actually give us > > - * the right clock. > > - * > > - * NOTE: if the PLL (maybe through a divider) could actually make > > - * a clock rate 999 Hz higher instead of the one we want then this > > - * could be a problem. Unfortunately there's not much we can do > > - * since it's baked into DRM to use kHz. It shouldn't matter in > > - * practice since Rockchip PLLs are controlled by tables and > > - * even if there is a divider in the middle I wouldn't expect PLL > > - * rates in the table that are just a few kHz different. > > - * > > - * 2. Get the clock framework to round the rate for us to tell us > > + * 1. Get the clock framework to round the rate for us to tell us > > * what it will actually make. > > * > > - * 3. Store the rounded up rate so that we don't need to worry about > > + * 2. Store the rounded up rate so that we don't need to worry about > > * this in the actual clk_set_rate(). > > */ > > - rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); > > + rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); > > adjusted_mode->clock = DIV_ROUND_UP(rate, 1000); > > > > return true; > > -- > > 2.25.1 > > > > -- > With Best Regards, > Andy Shevchenko > >

Hi, Am Mittwoch, 8. September 2021, 15:53:56 CEST schrieb Chris Morgan: > From: Chris Morgan <macromorgan@hotmail.com> > > After commit 928f9e268611 ("clk: fractional-divider: Hide > clk_fractional_divider_ops from wide audience") was merged it appears > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > examination of the problem, it looks like it was the fixup in the > rockchip_drm_vop.c file was causing the issue. The changes made to the > clk driver appear to change some assumptions made in the fixup. > > After debugging the working 5.14 kernel and the no-longer working > 5.15 kernel, it looks like this was broken all along but still > worked, whereas after the fractional clock change it stopped > working despite the issue (it went from sort-of broken to very broken). > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > 17000999 on my board. The clock driver was taking the value of the > parent clock and attempting to divide the requested value from it > (17000000/17000999 = 0), then subtracting 1 from it (making it -1), > and running it through fls_long to get 64. It would then subtract > the value of fd->mwidth from it to get 48, and then bit shift > 17000999 to the left by 48, coming up with a very large number of > 7649082492112076800. This resulted in a numerator of 65535 and a > denominator of 1 from the clk driver. The driver seemingly would > try again and get a correct 1:1 value later, and then move on. > > Output from my 5.14 kernel (with some printfs for good measure): > [ 2.830066] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > [ 2.839431] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > [ 2.855980] Clock is dclk_vopb_frac > [ 2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > [ 2.903529] Clock is dclk_vopb_frac > [ 2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > [ 2.903579] Clock is dclk_vopb_frac > [ 2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > Contrast this with 5.15 after the clk change where the rate of 17000999 > was getting passed and resulted in numerators/denomiators of 17001/ > 17000. > > Output from my 5.15 kernel (with some printfs added for good measure): > [ 2.817571] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > [ 2.826975] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > [ 2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, Best Denominator 17017 > [ 2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > [ 2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > [ 2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > After tracing through the code it appeared that this function here was > adding a 999 to the requested frequency because of how the clk driver > was rounding/accepting those frequencies. I believe after the changes > made in the commit listed above the assumptions listed in this driver > are no longer true. When I remove the + 999 from the driver the DSI > panel begins to work again. > > Output from my 5.15 kernel with 999 removed (printfs added): > [ 2.852054] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > [ 2.864483] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > [ 2.880869] Clock is dclk_vopb_frac > [ 2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > [ 2.928521] Clock is dclk_vopb_frac > [ 2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > [ 2.928570] Clock is dclk_vopb_frac > [ 2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > I have tested the change extensively on my Odroid Go Advance (Rockchip > RK3326) and it appears to work well. However, this change will affect > all Rockchip SoCs that use this driver so I believe further testing > is warranted. Please note that without this change I can confirm > at least all PX30s with DSI panels will stop working with the 5.15 > kernel. > > Signed-off-by: Chris Morgan <macromorgan@hotmail.com> With 5.15-rc1 on: - rk3288-pinky+eDP (was working -> still working) - rk3399-kevin+eDP (was working -> still working) - px30-minievb (was broken -> working again) Tested-by: Heiko Stuebner <heiko@sntech.de> I've also added Doug, maybe he remembers some historic artifact to keep in mind about the original patch. Heiko > --- > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++------------------ > 1 file changed, 3 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > index ba9e14da41b4..bfef4f52dce6 100644 > --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > @@ -1169,31 +1169,16 @@ static bool vop_crtc_mode_fixup(struct drm_crtc *crtc, > * > * - DRM works in in kHz. > * - Clock framework works in Hz. > - * - Rockchip's clock driver picks the clock rate that is the > - * same _OR LOWER_ than the one requested. > * > * Action plan: > * > - * 1. When DRM gives us a mode, we should add 999 Hz to it. That way > - * if the clock we need is 60000001 Hz (~60 MHz) and DRM tells us to > - * make 60000 kHz then the clock framework will actually give us > - * the right clock. > - * > - * NOTE: if the PLL (maybe through a divider) could actually make > - * a clock rate 999 Hz higher instead of the one we want then this > - * could be a problem. Unfortunately there's not much we can do > - * since it's baked into DRM to use kHz. It shouldn't matter in > - * practice since Rockchip PLLs are controlled by tables and > - * even if there is a divider in the middle I wouldn't expect PLL > - * rates in the table that are just a few kHz different. > - * > - * 2. Get the clock framework to round the rate for us to tell us > + * 1. Get the clock framework to round the rate for us to tell us > * what it will actually make. > * > - * 3. Store the rounded up rate so that we don't need to worry about > + * 2. Store the rounded up rate so that we don't need to worry about > * this in the actual clk_set_rate(). > */ > - rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); > + rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); > adjusted_mode->clock = DIV_ROUND_UP(rate, 1000); > > return true; >

Hi, On Tue, Sep 14, 2021 at 4:14 AM Heiko Stübner <heiko@sntech.de> wrote: > > Hi, > > Am Mittwoch, 8. September 2021, 15:53:56 CEST schrieb Chris Morgan: > > From: Chris Morgan <macromorgan@hotmail.com> > > > > After commit 928f9e268611 ("clk: fractional-divider: Hide > > clk_fractional_divider_ops from wide audience") was merged it appears > > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > > examination of the problem, it looks like it was the fixup in the > > rockchip_drm_vop.c file was causing the issue. The changes made to the > > clk driver appear to change some assumptions made in the fixup. > > > > After debugging the working 5.14 kernel and the no-longer working > > 5.15 kernel, it looks like this was broken all along but still > > worked, whereas after the fractional clock change it stopped > > working despite the issue (it went from sort-of broken to very broken). > > > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > > 17000999 on my board. The clock driver was taking the value of the > > parent clock and attempting to divide the requested value from it > > (17000000/17000999 = 0), then subtracting 1 from it (making it -1), > > and running it through fls_long to get 64. It would then subtract > > the value of fd->mwidth from it to get 48, and then bit shift > > 17000999 to the left by 48, coming up with a very large number of > > 7649082492112076800. This resulted in a numerator of 65535 and a > > denominator of 1 from the clk driver. The driver seemingly would > > try again and get a correct 1:1 value later, and then move on. > > > > Output from my 5.14 kernel (with some printfs for good measure): > > [ 2.830066] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > [ 2.839431] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > [ 2.855980] Clock is dclk_vopb_frac > > [ 2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > > [ 2.903529] Clock is dclk_vopb_frac > > [ 2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > [ 2.903579] Clock is dclk_vopb_frac > > [ 2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > > > Contrast this with 5.15 after the clk change where the rate of 17000999 > > was getting passed and resulted in numerators/denomiators of 17001/ > > 17000. > > > > Output from my 5.15 kernel (with some printfs added for good measure): > > [ 2.817571] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > [ 2.826975] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > [ 2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, Best Denominator 17017 > > [ 2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > [ 2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > [ 2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > > > After tracing through the code it appeared that this function here was > > adding a 999 to the requested frequency because of how the clk driver > > was rounding/accepting those frequencies. I believe after the changes > > made in the commit listed above the assumptions listed in this driver > > are no longer true. When I remove the + 999 from the driver the DSI > > panel begins to work again. > > > > Output from my 5.15 kernel with 999 removed (printfs added): > > [ 2.852054] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > [ 2.864483] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > [ 2.880869] Clock is dclk_vopb_frac > > [ 2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > [ 2.928521] Clock is dclk_vopb_frac > > [ 2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > [ 2.928570] Clock is dclk_vopb_frac > > [ 2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > > > I have tested the change extensively on my Odroid Go Advance (Rockchip > > RK3326) and it appears to work well. However, this change will affect > > all Rockchip SoCs that use this driver so I believe further testing > > is warranted. Please note that without this change I can confirm > > at least all PX30s with DSI panels will stop working with the 5.15 > > kernel. > > > > Signed-off-by: Chris Morgan <macromorgan@hotmail.com> > > With 5.15-rc1 on: > - rk3288-pinky+eDP (was working -> still working) > - rk3399-kevin+eDP (was working -> still working) > - px30-minievb (was broken -> working again) > > Tested-by: Heiko Stuebner <heiko@sntech.de> > > > I've also added Doug, maybe he remembers some historic artifact > to keep in mind about the original patch. So when you say "was working" -> "still working", how did you test that? Did you just confirm that something was showing up on the screen, or did you check debugfs and make sure that the PLLs were all being set correctly? Also, I can't quite remember if the problems here were related to the internal display (and on which rockchip device) or HDMI where we could have a whole lot more possible pixel clocks. I can dig if need be. ...but I'm fairly certain that we can't just delete the "+ 999" and expect everything to work. All of the stuff in the comment above the "+ 999" is still as true today as it was when I wrote it. DRM is still in kHz and the Rockchip clock driver still rounds down. I suspect that the problem here is that the logic I wrote up just doesn't work great if your display clock is made by a "frac" clock. That's pretty much what I was saying when I wrote the comment: > NOTE: if the PLL (maybe through a divider) could actually make > a clock rate 999 Hz higher instead of the one we want then this > could be a problem. Maybe we can come up with a solution, though. So one thing is that I'd suspect that the problem is actually a bug with the fractional clock driver. "Best Numerator 17001, Best Denominator 17000" is probably not a valid thing to set and the clock driver should know this and seek out a different rate. I haven't dug through your code paths but, for instance, the comments above rockchip_fractional_approximation() say that denominator must be 20x the numerator if you want a good clock. That's clearly not the case here. There could also be maximum values of the numerator / denominator that are being ignored. I would sorta bet that if the frac clock driver was fixed that a clock anywhere between 17000000 and 17000999 would work just fine for you. In any case, despite the clock driver being screwy, probably this would fix it for you (untested) and is a better solution I think: rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); if (rate / 1000 != adjusted_mode->clock) rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); Basically: if we can happen to make the rate exactly then we're good. Otherwise then try bumping up by 999. > > --- > > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++------------------ > > 1 file changed, 3 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > index ba9e14da41b4..bfef4f52dce6 100644 > > --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > @@ -1169,31 +1169,16 @@ static bool vop_crtc_mode_fixup(struct drm_crtc *crtc, > > * > > * - DRM works in in kHz. > > * - Clock framework works in Hz. > > - * - Rockchip's clock driver picks the clock rate that is the > > - * same _OR LOWER_ than the one requested. > > * > > * Action plan: > > * > > - * 1. When DRM gives us a mode, we should add 999 Hz to it. That way > > - * if the clock we need is 60000001 Hz (~60 MHz) and DRM tells us to > > - * make 60000 kHz then the clock framework will actually give us > > - * the right clock. > > - * > > - * NOTE: if the PLL (maybe through a divider) could actually make > > - * a clock rate 999 Hz higher instead of the one we want then this > > - * could be a problem. Unfortunately there's not much we can do > > - * since it's baked into DRM to use kHz. It shouldn't matter in > > - * practice since Rockchip PLLs are controlled by tables and > > - * even if there is a divider in the middle I wouldn't expect PLL > > - * rates in the table that are just a few kHz different. > > - * > > - * 2. Get the clock framework to round the rate for us to tell us > > + * 1. Get the clock framework to round the rate for us to tell us > > * what it will actually make. > > * > > - * 3. Store the rounded up rate so that we don't need to worry about > > + * 2. Store the rounded up rate so that we don't need to worry about > > * this in the actual clk_set_rate(). > > */ > > - rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); > > + rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); > > adjusted_mode->clock = DIV_ROUND_UP(rate, 1000); > > > > return true; > > > > > >

On Thu, Sep 16, 2021 at 10:08:05AM -0700, Doug Anderson wrote: > Hi, > > On Tue, Sep 14, 2021 at 4:14 AM Heiko Stübner <heiko@sntech.de> wrote: > > > > Hi, > > > > Am Mittwoch, 8. September 2021, 15:53:56 CEST schrieb Chris Morgan: > > > From: Chris Morgan <macromorgan@hotmail.com> > > > > > > After commit 928f9e268611 ("clk: fractional-divider: Hide > > > clk_fractional_divider_ops from wide audience") was merged it appears > > > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > > > examination of the problem, it looks like it was the fixup in the > > > rockchip_drm_vop.c file was causing the issue. The changes made to the > > > clk driver appear to change some assumptions made in the fixup. > > > > > > After debugging the working 5.14 kernel and the no-longer working > > > 5.15 kernel, it looks like this was broken all along but still > > > worked, whereas after the fractional clock change it stopped > > > working despite the issue (it went from sort-of broken to very broken). > > > > > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > > > 17000999 on my board. The clock driver was taking the value of the > > > parent clock and attempting to divide the requested value from it > > > (17000000/17000999 = 0), then subtracting 1 from it (making it -1), > > > and running it through fls_long to get 64. It would then subtract > > > the value of fd->mwidth from it to get 48, and then bit shift > > > 17000999 to the left by 48, coming up with a very large number of > > > 7649082492112076800. This resulted in a numerator of 65535 and a > > > denominator of 1 from the clk driver. The driver seemingly would > > > try again and get a correct 1:1 value later, and then move on. > > > > > > Output from my 5.14 kernel (with some printfs for good measure): > > > [ 2.830066] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > > [ 2.839431] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > > [ 2.855980] Clock is dclk_vopb_frac > > > [ 2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > > > [ 2.903529] Clock is dclk_vopb_frac > > > [ 2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > > [ 2.903579] Clock is dclk_vopb_frac > > > [ 2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > > > > > Contrast this with 5.15 after the clk change where the rate of 17000999 > > > was getting passed and resulted in numerators/denomiators of 17001/ > > > 17000. > > > > > > Output from my 5.15 kernel (with some printfs added for good measure): > > > [ 2.817571] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > > [ 2.826975] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > > [ 2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, Best Denominator 17017 > > > [ 2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > > [ 2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > > [ 2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, Best Denominator 17000 > > > > > > After tracing through the code it appeared that this function here was > > > adding a 999 to the requested frequency because of how the clk driver > > > was rounding/accepting those frequencies. I believe after the changes > > > made in the commit listed above the assumptions listed in this driver > > > are no longer true. When I remove the + 999 from the driver the DSI > > > panel begins to work again. > > > > > > Output from my 5.15 kernel with 999 removed (printfs added): > > > [ 2.852054] rockchip-drm display-subsystem: bound ff460000.vop (ops vop_component_ops) > > > [ 2.864483] rockchip-drm display-subsystem: bound ff450000.dsi (ops dw_mipi_dsi_rockchip_ops) > > > [ 2.880869] Clock is dclk_vopb_frac > > > [ 2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > > [ 2.928521] Clock is dclk_vopb_frac > > > [ 2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > > [ 2.928570] Clock is dclk_vopb_frac > > > [ 2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best Denominator 1 > > > > > > I have tested the change extensively on my Odroid Go Advance (Rockchip > > > RK3326) and it appears to work well. However, this change will affect > > > all Rockchip SoCs that use this driver so I believe further testing > > > is warranted. Please note that without this change I can confirm > > > at least all PX30s with DSI panels will stop working with the 5.15 > > > kernel. > > > > > > Signed-off-by: Chris Morgan <macromorgan@hotmail.com> > > > > With 5.15-rc1 on: > > - rk3288-pinky+eDP (was working -> still working) > > - rk3399-kevin+eDP (was working -> still working) > > - px30-minievb (was broken -> working again) > > > > Tested-by: Heiko Stuebner <heiko@sntech.de> > > > > > > I've also added Doug, maybe he remembers some historic artifact > > to keep in mind about the original patch. > > So when you say "was working" -> "still working", how did you test > that? Did you just confirm that something was showing up on the > screen, or did you check debugfs and make sure that the PLLs were all > being set correctly? Also, I can't quite remember if the problems here > were related to the internal display (and on which rockchip device) or > HDMI where we could have a whole lot more possible pixel clocks. I can > dig if need be. ...but I'm fairly certain that we can't just delete > the "+ 999" and expect everything to work. All of the stuff in the > comment above the "+ 999" is still as true today as it was when I > wrote it. DRM is still in kHz and the Rockchip clock driver still > rounds down. > > I suspect that the problem here is that the logic I wrote up just > doesn't work great if your display clock is made by a "frac" clock. > That's pretty much what I was saying when I wrote the comment: > > > NOTE: if the PLL (maybe through a divider) could actually make > > a clock rate 999 Hz higher instead of the one we want then this > > could be a problem. > > Maybe we can come up with a solution, though. > > So one thing is that I'd suspect that the problem is actually a bug > with the fractional clock driver. "Best Numerator 17001, Best > Denominator 17000" is probably not a valid thing to set and the clock > driver should know this and seek out a different rate. I haven't dug > through your code paths but, for instance, the comments above > rockchip_fractional_approximation() say that denominator must be 20x > the numerator if you want a good clock. That's clearly not the case > here. There could also be maximum values of the numerator / > denominator that are being ignored. I would sorta bet that if the frac > clock driver was fixed that a clock anywhere between 17000000 and > 17000999 would work just fine for you. > > In any case, despite the clock driver being screwy, probably this > would fix it for you (untested) and is a better solution I think: > > rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); > if (rate / 1000 != adjusted_mode->clock) > rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); This appears to work for me at least. I'll resubmit a patch with this as the solution. Thank you. > > Basically: if we can happen to make the rate exactly then we're good. > Otherwise then try bumping up by 999. > > > > > --- > > > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++------------------ > > > 1 file changed, 3 insertions(+), 18 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > > index ba9e14da41b4..bfef4f52dce6 100644 > > > --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > > @@ -1169,31 +1169,16 @@ static bool vop_crtc_mode_fixup(struct drm_crtc *crtc, > > > * > > > * - DRM works in in kHz. > > > * - Clock framework works in Hz. > > > - * - Rockchip's clock driver picks the clock rate that is the > > > - * same _OR LOWER_ than the one requested. > > > * > > > * Action plan: > > > * > > > - * 1. When DRM gives us a mode, we should add 999 Hz to it. That way > > > - * if the clock we need is 60000001 Hz (~60 MHz) and DRM tells us to > > > - * make 60000 kHz then the clock framework will actually give us > > > - * the right clock. > > > - * > > > - * NOTE: if the PLL (maybe through a divider) could actually make > > > - * a clock rate 999 Hz higher instead of the one we want then this > > > - * could be a problem. Unfortunately there's not much we can do > > > - * since it's baked into DRM to use kHz. It shouldn't matter in > > > - * practice since Rockchip PLLs are controlled by tables and > > > - * even if there is a divider in the middle I wouldn't expect PLL > > > - * rates in the table that are just a few kHz different. > > > - * > > > - * 2. Get the clock framework to round the rate for us to tell us > > > + * 1. Get the clock framework to round the rate for us to tell us > > > * what it will actually make. > > > * > > > - * 3. Store the rounded up rate so that we don't need to worry about > > > + * 2. Store the rounded up rate so that we don't need to worry about > > > * this in the actual clk_set_rate(). > > > */ > > > - rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); > > > + rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); > > > adjusted_mode->clock = DIV_ROUND_UP(rate, 1000); > > > > > > return true; > > > > > > > > > > >

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index ba9e14da41b4..bfef4f52dce6 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -1169,31 +1169,16 @@ static bool vop_crtc_mode_fixup(struct drm_crtc *crtc, * * - DRM works in in kHz. * - Clock framework works in Hz. - * - Rockchip's clock driver picks the clock rate that is the - * same _OR LOWER_ than the one requested. * * Action plan: * - * 1. When DRM gives us a mode, we should add 999 Hz to it. That way - * if the clock we need is 60000001 Hz (~60 MHz) and DRM tells us to - * make 60000 kHz then the clock framework will actually give us - * the right clock. - * - * NOTE: if the PLL (maybe through a divider) could actually make - * a clock rate 999 Hz higher instead of the one we want then this - * could be a problem. Unfortunately there's not much we can do - * since it's baked into DRM to use kHz. It shouldn't matter in - * practice since Rockchip PLLs are controlled by tables and - * even if there is a divider in the middle I wouldn't expect PLL - * rates in the table that are just a few kHz different. - * - * 2. Get the clock framework to round the rate for us to tell us + * 1. Get the clock framework to round the rate for us to tell us * what it will actually make. * - * 3. Store the rounded up rate so that we don't need to worry about + * 2. Store the rounded up rate so that we don't need to worry about * this in the actual clk_set_rate(). */ - rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000 + 999); + rate = clk_round_rate(vop->dclk, adjusted_mode->clock * 1000); adjusted_mode->clock = DIV_ROUND_UP(rate, 1000); return true;